Editor's Notes



Assessment is a potent tool in shaping directions for higher education.
Legislators are interested in it. Administrators are mystified by it.
Practitioners are challenged by it. Faculty are afraid of it. Students are
affected by it. What to do, how to do it, and why it should be done are
being asked on many levels. In the 1987 education environment, assessment
can be defined as the activities of testing, evaluation, and documentation.
Standardized testing is only one of a number of avenues available.

Almost without exception, recent writers on reform in higher education
address the issue of assessment. While some place the responsibility with
the individual institution, others urge movement at the state level. And
movement has occurred. A recent survey of the fifty states found that,
while few had formal assessment mechanisms in place at the state level only
a year or two ago, two thirds now report that they do if the term assessment
is not limited to traditional and narrow definitions (Boyer, Ewell, Finney,
and Mingle, 1987). In contrast to the mandated statewide testing programs
that are typically envisioned for state-level assessment, these authors
describe a mosaic of state initiatives that extend assessment initiatives to
early intervention programs, incorporate assessment into existing planning
and accountability mechanisms, and redefine assessment as including the
monitoring of other outcomes, such as student retention and graduate
satisfaction. Moreover, most of the state higher education executive
officers surveyed believe that assessment plans should be developed locally
and that they should reflect the institutional mission.

The current literature discusses community colleges as a component of
postsecondary education, subject to the same standards as other
institutions. We acknowledge that we cannot discuss assessment for community
colleges as separate from the dialogue on assessment for four-year colleges
and universities. In fact, community colleges have a particularly urgent
mandate to join in the dialogue, shape the assessment models, and present
their findings and outcomes to the public. The traditional response to the
calls to improve higher education has been to raise entrance standards, and
the survey by Boyer, Ewell, Finney, and Mingle (1987) indicates that some
states are again considering this response. Community colleges are
open-door institutions. If they are to retain their mission, they have the
obligation to present other responses to the demands for accountability
through assessment.

In a review of state-mandated testing and educational reform, Airasian
(1987) considers the new roles being asked of assessment, especially
state-mandated assessment. Airasian notes that an emphasis on the technical
aspects of testing will not suffice, since the crucial issues are social,
economic, and value laden. It is appropriate, then, that the contents of
this volume are much more than a how-to guide. The chapters cover three
areas of assessment: accountability issues and the political tensions that
they reflect; assessment practices, the use and misuse of testing, and
emerging directions; and the impact of assessment, which includes issues of
student access and opportunity, technological applications, expanded models
for assessment, and increased linkages between high schools and colleges as
a result of assessment information. Finally, this volume suggests the need
to focus on the next challenge: to take assessment beyond its presently
politically mandated stages to its rightful purpose of improving the
curriculum and the quality of teaching and learning within the institution.

To introduce accountability issues, Daniel Resnick offers a historical
perspective on testing and American education. He argues that the tensions
and solutions once faced by the public school sector are now being
encountered in the arena of higher education. In Chapter Two, Peter M.
Hirsch explores the relationship between mandates for educational
excellence and increased standards and access to educational opportunity
for all students. He underscores the difference between
accountability-based assessment and compliance-based testing. In Chapter
Three, John Losak argues that rigor in classroom assessment is the only way
of reducing outside interference in the assessment process. He recommends
that we reduce the role of individual instructors in assessment.

The area of assessment practices covers a wide variety of topics. One
approach advocated with increasing frequency but as yet seldom implemented
is called value-added testing. In Chapter Four, Marcia Belcher synthesizes
the arguments for and against such an approach and describes several
alternatives. In Chapter Five, Scarvia Anderson examines the assessment
method most often used (and abused) in higher education today: the
teacher-made test.

Two practices are increasingly common components of the testing 
arsenal: placement testing and large-scale essay testing. In Chapter Six, 
Linda Crocker describes ways of overcoming some of the common pitfalls
of essay testing and scoring. In Chapter Seven, Edward Morante critiques
placement test practices and models and offers guidelines for the
development of an appropriate placement testing system, and in Chapter Eight,
Emmett Casey discusses ways in which testing practices can be modified to 
meet the special needs of disabled students. 

The last facet of assessment considered in this volume reflects the
trends that are likely to develop as a result of the increased attention to
assessment. Roy McTarnaghan argues in Chapter Nine that assessment does
not necessarily affect minorities negatively. In Chapter Ten, Jeanine
Rounds, Martha Kanter, and Marlene Blumin consider the impact of emerging
technology on testing, and in Chapter Eleven, Susan Obler and Maureen Ramer
point out that the designers of assessment and counseling systems need to
consider populations other than recent high school graduates and to
envision systems that accommodate individual education planning and career
goals. In the concluding chapter, Jim Palmer cites recent publications that
address the issues raised in this volume.

The contributors began from the premise that colleges must restore
public confidence in their quality and effectiveness. We conclude by
suggesting that the effective institution will no longer focus only on
assessing its students' abilities but also on using assessment information
to improve its curriculum and the quality of the teaching-learning process.
In their efforts to restore public confidence through assessment, colleges
must appreciate that standardized testing is only one of many tools.
Colleges must learn to use assessment to provide information that documents
past successes and future needs and that helps to improve the curriculum.



Dorothy Bray 
Marcia J. Belcher
Editors

Student assessment efforts are historically linked to the ebb and
flow of public confidence in the nation's schools and colleges.



Expansion, Quality, and Testing in American Education

Daniel P. Resnick 



The United States has just completed a momentous expansion of its system
of higher education. That expansion was sustained over a period of about
thirty years, between 1954 and 1983. During that period, enrollments
increased on average close to 6 percent each year and for the first twenty
years at an average rate of 7.6 percent (National Center for Education
Statistics, 1973, 1985; Bureau of the Census, 1976). Major changes occurred
in the postsecondary structure as it grew and adapted to the needs of a
growing student population. New kinds of institutions, such as the
community colleges, took on an important role. Large state institutions
became multiversities, and the liberal arts colleges became increasingly
vocationally oriented. The pattern of majors for students shifted, as did
the timing and sequence of the years of undergraduate education.

Today, about 3,000 accredited colleges and universities in the United
States enroll close to ten million undergraduate students. At the beginning
of the expansion, there were about 2,000 accredited colleges and
universities and three million undergraduate students. During these three
decades, the number of institutions of higher education increased by 50
percent and the student enrollment tripled. By the end of the period of
expansion just described, in academic year 1982-83, national enrollments in
each of the major types of postsecondary institution were either holding
steady or declining.

The end of expansion poses questions about the future of many
institutions. All face problems involved in maintaining enrollments,
establishing or sustaining quality programs, securing adequate financing,
and maintaining public confidence. The selective institutions (that is,
institutions that are able to turn away at least one student for each
student accepted) are in the most favored positions, but they are no more
than fifty or so in number, and perhaps only half can be more selective
(Fiske, 1985). Most institutions of higher education see rather lean years
ahead.

The present situation is not yet a crisis, but the problems are real. 
The supply of places exceeds the demand. A number of institutions have
insufficient funds to maintain operations. Large segments of the lay public
have little confidence in the quality and effectiveness of higher education.
In contrast to the problems of the high schools, the problems of the colleges 
and universities are not yet at center stage, and there is certainly no consen- 
sus on what ought to be done. 

Nonetheless, the problems will receive increasing attention in the
years ahead. Political actors and scholars are pointing fingers. Secretary
of Education William Bennett has called on college and university leaders,
first in October 1984 and then on a number of subsequent occasions, to find
ways to show the public that their institutions make a valued difference in
the education and growth of students. Governors have called on universities
and colleges to show their contribution to more efficient learning. Several
state legislatures are refusing to maintain funding for state universities
and community colleges without prior demonstrations that current subsidies
have been used effectively. In his examination of a number of recent studies
of undergraduate education, Hacker (1986) expressed a similar doubt about
the quality and effectiveness of higher education.

How can we gain perspective on these developments? To students of
American higher education, the current problems have a familiar ring
because they suggest the problems that followed the half century of
expansion of the system of secondary education in the United States in the
period between 1890 and 1935. During that period, enrollments increased on
average almost 8 percent each year, with a peak close to 9 percent in the
years between 1909 and 1924. The number of public high school diplomas
awarded increased on average 7.9 percent each year during that period; in
the peak years, the average increase was 9.8 percent (Bureau of the Census,
1976). Although there are obvious differences between institutions of
secondary education and institutions of higher education, we propose this
analogy because there are common features in the pressures behind expansion
in the two periods, certain common features in the kinds of transformations
undergone by educational institutions, certain common problems in
maintaining the confidence of the public in the quality and effectiveness
of changing institutions, and certain common strategies for maintaining
this confidence. At the same time, the comparison makes us aware that the
problems of higher education today are distinctive and that they require
new remedies. The analogy is imperfect but useful.

During this period of rapid expansion between 1890 and 1935, the
United States became the first Western nation to bring a substantial portion
of its school-age population into secondary schools. France, Germany, and
Great Britain did not begin a comparable expansion until after the Second
World War (Heidenheimer, 1973). The rate of expansion of the secondary
schools then exceeded the increase in the school-eligible population, which
had been swollen during most of that period by the heaviest immigration
rates in our history (Wagner, 1971). As scholars have argued, the
commitment to schooling was driven by a belief in education as a source of
moral improvement, common to both Protestant and rationalist traditions in
our society (Welter, 1962).

In 1890, it can be estimated that fewer than 15 percent of the fourteen-
to seventeen-year-olds in our society were in high schools. By 1935, the
figure had leaped to more than 70 percent. In 1890, little more than 6
percent of the seventeen- and eighteen-year-olds completed high school. By
1935, almost half of those in that age group had done so (Bureau of the
Census, 1976). During these years, the costs of school construction and
teacher salaries were largely and increasingly borne by local homeowners in
communities across America.

The schools became less selective during this period. The disappearance
of the entrance examination to the high school was one important
sign of this development: Maintained by most of the public high schools in 
1900, the entrance examination had disappeared almost entirely by 1925. 
High school entrance examinations were incompatible with the mission of 
opening the doors to all who were interested in continuing their education. 
During this period, there also developed a pattern of promotion from class
to class for entire age groups that was relatively independent of the mastery 
of school subjects. The older pattern of promotion by merit was rejected as 
costly, inefficient, and out of harmony with the commitment to education 
growth (Ayres, 1909). 

School programs adapted to the new waves of students, introducing 
subject matter that was believed to meet student interests more than the 
established programs of history, geography, literature, classics (languages,
literature, philosophy), science, and mathematics. Vocational subjects
entered the curriculum, along with a variety of other courses that were 
considered part of a general program and not as preparation for college. 
The Cardinal Principles of Secondary Education (National Education Asso- 
ciation of the United States, 1918) provided a rationale for the new voca- 
tionalism, just as the Report of the Committee of Ten (Sizer, 1964) had 
provided programmatic support for the traditional curriculum.

A major new institution was created in this period, the comprehensive
high school. Within its walls gathered students with very different
programs: some remaining four years, others dropping out earlier; some
headed for the trades, others for college. They would meet in a common
homeroom class before dispersing for very different and varied educational
experiences. A major casualty of this new pattern of education was the core
curriculum. Students were brought together for elements of a common social
experience, not a common academic program.

Public confidence in the effectiveness of the high schools was shaken
by the knowledge that students who would formerly have failed high school
entrance examinations could now enter freely. It was also the case that
testing of 1.7 million military recruits in World War I revealed a large
number of near illiterates who had attended American high schools (Yerkes,
1921; Brigham, 1923). In response, school principals and superintendents in
the 70,000 or so school districts across the country made an important
effort after World War I through their professional associations and their
individual efforts within school systems to show that they were managing
their expanding systems efficiently. Extolling their testing programs, they
argued that scientific procedures were being used to place students in
appropriate programs and that the effectiveness of the different
instructional programs was being regularly assessed. The chosen instrument
for this scientific assessment was the standardized objective
multiple-choice test (Resnick, 1982).

In the period between 1912 and 1922, school testing bureaus were
created in nine of the ten largest city school districts in the United
States, and by 1925, there were sixty such bureaus across the country. These
bureaus ordered, administered, and interpreted tests in their school
districts. In response to a survey in 1925, they reported that the major use
of aptitude tests was to place students in homogeneously grouped classes
(Bureau of Education, 1926; Deffenbaugh, 1923, 1926). Achievement tests were
used to assess the effectiveness of programs within individual schools and
to compare the performance of different schools.

The fact that results on achievement tests were published in local
newspapers and that aptitude tests were widely used to defend decisions
about classroom placement and educational guidance indicates two points
of great importance. First, educators were very sensitive about their
relations with parents and community leaders. They recognized the
importance of remaining accountable for their conduct to the community of
parents and taxpayers. Second, they found that decisions that could be
supported by test results were generally assumed to be sound. Tests appeared
to be impartial, objective, and scientific. For lay people, the results were
difficult to contest.

Like the first expansion, the more than threefold increase in
postsecondary undergraduate enrollments between 1954 and 1983 was driven in
part by demographic factors and in part by the increased importance
assigned in the workplace and society at large to additional years of
education. Not quite half of the increase can be attributed to the baby
boom. The rest came from an increase in the portion of the youth cohort that
attended college. As in the first expansion, America was the first Western
nation to offer so many years of education to her young people. The first
expansion that we are examining here was aimed principally at those
between the ages of fourteen and eighteen; the second, at those between
eighteen and twenty-four.

This second expansion brought changes in the structures of higher
education, as the first expansion had brought changes in the structure of
the high schools. One major change was the dramatic sevenfold growth in
the number of community colleges: By 1983 about 1,450 two-year institutions
were in place. As these institutions grew in number, their enrollments
kept pace. More than 40 percent of the close to ten million undergraduate
students in 1983 were in two-year community colleges, as compared with 14
percent in 1960. These students tended to be part-time, vocationally
oriented, and relatively unlikely to complete a four-year degree.

As undergraduates sought their degrees in different kinds of institutions
and as new kinds of students entered these structures, the academic
programs that students pursued also changed character, even in the
traditional four-year institution. A core curriculum in traditional subjects
gave way to a variety of vocational offerings. Analysis of National Center
for Education Statistics (1985) data on baccalaureate degrees awarded
between 1963 and 1983 indicates that the portion of students who majored in
history, social science, literature, foreign languages, philosophy, math,
and science declined precipitously, from about 40 percent to 20 percent of
majors. At the same time, business majors almost doubled as a portion of
baccalaureate recipients, receiving 23 percent of the degrees.

Just as growth in public funding was critical for the secondary
schools during their period of expansion, colleges and universities became
more dependent on public funding during their expansion. The greatest
single beneficiaries of enrollments in the second period of growth were the
state and community colleges, which depended largely on state legislatures
for support.

This second period of expansion has been a difficult one in which to 
maintain public confidence in institutions of higher learning. The nation 
is still emerging from an intense period of criticism of its secondary
institutions that produced more than a dozen commission reports and an
indictment of a "rising tide of mediocrity." The public recognizes that the
products of these secondary schools are entering higher education. How 
good are the institutions that receive these graduates? 

Even as confidence in the quality and effectiveness of institutions of 
higher learning has waned, the cost of schooling has risen more rapidly 
than the rate of inflation. And, unemployment and underemployment
among young graduates have brought into question the ability of a college
degree to assure integration into the work force. At the same time, the nation
faces demands for increased military appropriations and continuing support 
of domestic entitlement programs in a period of unsettling fiscal problems. 
These are difficult times in which to restore confidence in the quality of our 
institutions of higher learning. 

But, our colleges and universities must act to restore public confi- 
dence. The recruitment of students, federal and state subsidies, foundation 
support, and even research contracts depend on the implied and preliminary 
contract of confidence. Four kinds of action are likely. The first two employ 
the time-honored techniques of our market society and democratic political 
system. The last two invoke strategies associated with the movement for 
assessment in higher education. 

The first response can be described as marketing, directed through a 
variety of media to publics of parents and potential students. The second is 
lobbying, in which public colleges and universities, along with private insti- 
tutions seeking public support for research and other purposes, make their 
claims before legislators, departments of education, and other agencies. The 
third response, testing, calls on a form of assessment whose first educational 
uses were in primary and secondary schools. Standardized tests are now 
used in some colleges and universities to establish minimum competency
for admission, promotion, or graduation. The expectation is that the
scientific nature of the procedure will satisfy external demands for
accountability. The fourth response is still emerging. It, too, belongs with
the current assessment movement. It calls on colleges and universities to
devise their own evaluation instruments, appropriate to their specific
missions, student bodies, and academic programs. Although the primary
clients for the resulting evaluations are the institution's administration,
faculty, and board of trustees, it is expected that these results, like
those from competency testing, will also be communicated to a wider public.

Standardized testing was used from the early 1920s by primary and 
secondary schools mainly to develop public confidence in placement
decisions and to assess programs. Secondary schools in a number of city and
state systems gave it a new use in the late 1960s and 1970s at a time of
contest over the behavior, learning, and course programs of secondary school
students. Tests that were standardized on a statewide basis were developed to
serve as measures of high school exit-level competency and make the
diploma a certification that the high school graduate had certain minimum
skills in reading and math. More than two thirds of the states had imposed 
minimum competency tests by the mid 1980s (Resnick, 1980; Ericson, 1984). 

Colleges and universities had used standardized intelligence tests for 
admissions screening since the early 1920s, in some instances to impose 
quotas against minorities (Wechsler, 1977). In 1926, Carl Brigham
introduced the Scholastic Aptitude Test (SAT) for the College Board. The SAT
drew on verbal and mathematical aptitudes in a multiple-choice mode; it
was not widely used until after World War II. But, in the past forty years it
has become the most heavily used test of the College Board and, with the
American College Test (ACT), the major entrance screening device used by
institutions of higher education. During the same period, the
multiple-choice mode was imposed for almost all subject matter testing of college
applicants. The results of those tests were used for placement and to grant 
credit or exemption. 

Competency testing in the high schools, which began in the late
1960s, created an acceptability for system- and statewide efforts to certify 
minimum levels of ability in reading, writing, and math among students in 
public institutions. In the early and mid 1980s, demands for competency 
testing were extended to colleges and universities. Florida, New Jersey, and 
Tennessee led the way in imposing mandated competency testing programs.
Such testing was used to place students with low levels of verbal and math- 
ematical skills in remedial tracks, to monitor entry-level qualifications for 
students transferring from two-year to four-year colleges, to establish
minimum competencies for graduation from four-year public institutions, and
in some instances to provide grounds for the reallocation of financial 
resources within a statewide university system. 

State legislatures demanded demonstrations of gains in achievements.
They wanted to see gains in learning by students during their undergraduate
years, and they wanted to see them measured by standardized tests. The
public became accustomed to seeing standardized testing used as a measure
of educational performance by institutions during the expansion of our
secondary education system. They appreciated its scientific character
(objectivity in grading, reliability of results, effective use of
technology), its simplicity (results that could be reduced to a single
score), and its economy (low per-unit cost for each administration). They
also liked the possibility of comparing the performance of one group with
the performance of populations elsewhere.

To measure achievement, the legislatures wanted achievement tests. 
Such tests could be provided statewide for basic math and reading skills
when the curriculum was adapted to teach what the tests measured. But,
unless a curriculum was created for the tests, it was impossible to expect the
measures to measure achievement, even when they were labeled achievement
tests. Given the variety and diversity of our institutions of higher learning,
the variety of textbooks, and the different ways in which faculty had been
trained, there was no residual common curriculum. This core had been 
fragmented in the colleges and universities, as it had earlier been fragmented 
in the high schools. Statewide achievement measures were possible for min- 
imum skills in specific areas where the tests actually prescribed the curricu- 
lum. It was not possible for other kinds of skills and knowledge. 

When broader measures of performance were sought, legislators and
educators had to turn to aptitude tests. Aptitude measures, which used some
variant of the verbal and mathematical sections in the group intelligence
tests introduced to elementary and secondary schools in the 1920s, had the
great merit of not being tied to any specific curriculum. Indeed, they were
respected in the 1920s because it was presumed that they did not
discriminate against those who had been exposed to courses of very different
character and quality. They were justified by some as somehow equalizing the
differences between weak schools and strong ones and as allowing native
abilities to triumph over poor environment.

In the 1920s, many psychologists believed that native aptitudes pre- 
dicted success in school, on the job, and in later life. Few share such beliefs 
in the 1980s. In place of a belief in the determining role in life of natural 
gifts and heredity, most Americans believe that hard work is the major 
determinant of success. Aptitude testing has been inherited from a period in 
which American elites shared different values. It has persisted for so long 
because we have not found other reliable predictors of future performance
that permit us to compare populations in our many and varied educational 
institutions. 

Reliance on aptitude tests in the 1980s is fraught with problems.
Aptitude tests still permit national comparisons of performance by
populations with very different kinds of educational experience. And, to the
degree that they measure knowledge and skills that are independent of what
is taught and learned in specific courses and curriculums, they control for
differences in school experience. However, performance on such measures is
strongly dependent on socioeconomic background, and it is far from
culture-free. Such performance privileges family background, not hard work.
Few can now accept that this kind of assessment is equitable.

Aptitude tests were not designed to measure college achievement. To
measure such achievement, we will need reliable measures of learning gains
on available local curricula. Such tests will have a classroom-based
curricular validity that nationally standardized achievement measures do not
have. But, they are not likely to permit the kinds of comparisons of
performance among institutions that nationally normed instruments make
possible. Will it ever be possible to develop tests that have curricular
validity and yet provide bases for comparisons nationally? This is a
challenge for test developers that requires them to pay equal attention to
what is taught and to what is learned in college and university classrooms.

Key foundations, professional associations, and the Department of 
Education are leading the search for new ways of measuring learning gains 
in higher education. They are joined by a number of institutions engaging
in their own experimentation, sometimes collaboratively, with or without 
external support. The American Association for Higher Education, with 
support from the Fund for the Improvement of Post-Secondary Education 
(FIPSE), has become a clearinghouse for information about current projects. 

In his recent study of tensions in undergraduate institutions, Boyer
(1987) has underlined the importance of ongoing assessment in the
baccalaureate college. The Carnegie Foundation for the Advancement of
Teaching is funding my own ongoing study of assessment issues in historical
and policy perspective. Adelman (1986) provides a useful introduction to
assessment issues. Bok (1986) makes the case for active involvement in
assessment by already strong institutions. Bok has joined FIPSE in funding a
three-year study of assessment in higher education led by Richard Light of
the Kennedy School. Faculty and administrators from nearby Ivy League
colleges and universities have joined Harvard colleagues in working groups
on a variety of assessment projects. The Association of American Colleges
has received support from FIPSE for a three-year study of pilot projects
that seek to strengthen academic programs in eighteen colleges and
universities.

The effort to build public confidence in higher education will focus 
public attention on the curricula of institutions of higher learning, and it 
may help our colleges to rebuild appropriate cores of learning in harmony 
with their educational goals. However, this program of reconstruction is a
long-term project. In the short term, the response of the great majority of
America's colleges and universities to a loss of confidence will be more
vigorous marketing efforts and increased lobbying for support from public
bodies. At the same time, many institutions will have to show their
accountability to state legislatures on common competency tests, which are little
adapted to reveal the goals and strengths of different campuses. Only a
small number of colleges and universities can be expected to lead the way in
developing new measures of assessment that are appropriate to the variety
of our postsecondary institutions.

Until there is more research, very little can be said about how stu- 
dents change and grow in the varied settings that have taken shape during 
the expansion of higher education. A small core of careful research and a 
number of personal intuitions complement shared experiences. What has 
been reported to date is not enough to dispel the current skepticism. When 
the research and reconstruction program of the next decade has produced 
results, the task of maintaining high levels of public funding for these insti- 
tutions may be easier than it is now. But, even with research that can show 
the value added by college, there will be no easy victory. The excess capacity
of our postsecondary institutions, the nation's economic and budgetary prob- 
lems, and the decline of public confidence in the preparation for college 
given by public secondary schools all suggest that the current problem of 
confidence is likely to persist for some time.



Community colleges will be asked to respond to calls for
increased educational excellence while maintaining access to
educational opportunity for students who are least prepared
to succeed. Accountability-based assessment rather than
compliance-based testing will be required to accomplish
this task.



The Other Side of Assessment 

Peter M. Hirsch 



From the earliest times, the focus of human thought has been to understand, 
explain, and predict the world in which we live. With the dawning of
civilization, our ancestors' efforts began to transcend the banding that enabled
them to survive a natural environment that was both hostile and dangerous. 
To overcome our physical limitations, we learned to live in groups and in 
ways that divided the labors of life into manageable and knowable tasks. If 
we were successful in placing the right persons in the right roles and if we 
were not overwhelmed by others who did a better job of assessment and 
placement, our societies survived.

As we learned to control nature, our numbers grew, and our societies
became larger and more complex. Role specialization increased, and we 
developed economic, political, religious, and social structures to create the 
order that was needed for the many to live together successfully. Gradually, 
the increasing complexity produced formalized systems for preparing per- 
sons to assume their roles. Knowledge acquired value, and schooling and 
education became necessary parts of the preparation. 

Today, American society faces even greater challenges in preparing 
individuals for successful participation. The information explosion, the 
enormous influx of immigrants and the new cultural diversity that they
create, the transition within our economy from a national to an
international base, the shift in employment opportunities from production to
services, and the increased role of technology in our daily lives have made
advanced formal education essential in America, not for some but for all.
The need for an effective and responsive education system has become so
crucial that several recent major national reports have addressed the question
of how our education systems can be strengthened to meet the challenges 
facing our society. 

The Lure of Reform 

Report after report calls for reform of American education. The spiral
of public education opportunity, which historically in this nation swirls
between access and quality, has once again turned to increased expectations
and heightened standards of student performance as the answer to the
problems of educating Americans.

Of the recent reports, that of the Study Group on the Conditions of
Excellence in American Higher Education (1984) has been most widely
quoted. Its position is quite clear: institutions should be accountable for
stating their expectations and standards. The Commission for Educational
Quality (1985) is even more emphatic. In their view, the quality and
meaning of undergraduate education has fallen to the point that mere access has
lost much of its value.

Each of us is susceptible to the lure of reform. It is a glamorous 
topic that has the face advantage of providing simple answers to complex 
questions. Yet, with an overburdened K-12 system and the documented 
underpreparedness not only of the new majority and the economically less 
well off but of the middle class as well, the problems of access and success, 
of standards and quality will be intimately interconnected as America's 
postsecondary education structures move into the twenty-first century. 

The Role of Community Colleges

There is no doubt that community colleges will be the first
institutions within the postsecondary education tier to count a majority of
minorities among their student bodies. In California, many elementary and
secondary schools already enroll a majority of minorities. For example,
more than eighty languages are spoken by students enrolled in the Los
Angeles Unified School District. And community colleges in California,
such as Compton and Los Angeles Southwest, already count a vast majority
of minorities among their students. Nor are these developments limited to
central Los Angeles. In Alameda, Orange, and San Francisco counties,
indeed across the state, community colleges are becoming the port of entry
to higher education for increasing numbers of the new majority and the
traditional poor. In its draft report on California community college reform,
the Joint Committee for Review of the Master Plan for Higher Education
(1986) estimated that, of the roughly 32 million persons expected to reside
in California by the turn of the century, 52 percent of the school-age
children will be minorities, and within the first decade of the twenty-first
century, the majority of Californians will represent minority populations. The
implications of the demographic data are inescapable: California will be a
new majority state. The question is not whether but when. It is equally
certain that other states will see similar developments.

In the larger domains of the economy and the quality of life, it
is the community colleges that will serve the needs of the new majority
and the traditional poor, adult learners and women returning to the
classroom, and workers seeking the skills newly required for employment. The
community colleges will enable these individuals and others to become
fully participating members of our economic, political, and societal fabric.

Indeed, community colleges are the central pivot point in a public 
education infrastructure designed to enable each person to realize his or her
individual potential, to achieve a quality of life that nurtures family and
community, and to participate successfully in the labor force. Only if these
objectives are achieved for all, fifth-week as well as fifth-generation, will
America be able to retain its pre-eminence among nations and continue to
compete effectively in the international marketplace. Community colleges
will play a key role in accomplishing these objectives. Their ability to do so
will be directly related to their ability to demonstrate accountability in
maintaining access while achieving the reforms that have been called for.

The Question of Accountability 

Partly in response to the work of the Commission on Instruction 
(1984) of the California Association of Community Colleges, the state of 
California established a citizen commission to review the state's master plan
for higher education. In completing the first part of its review, the California
State Commission for the Review of the Master Plan (1986, pp. 1-2) noted 
that, while the colleges had succeeded beyond all expectations in providing 
low-cost access, "access must be meaningful, and to be meaningful, it must 
be access to a quality system that helps ensure the success of every student 
who enrolls. The responsibility for this success falls on all who participate
. . . There must be a commitment on all sides, from the state, from the
colleges, and from the students, to excellence and accountability. It is to
this end that we urge change." 

The emphasis on access, excellence, and accountability is neither 
new nor recent with respect to American higher education. What is new is
the repeated statement, in all recent state and national reports, that access is
meaningless without accountability. However, accountability is all too often
equated with compliance. This is especially true of the laws enacted by state
legislatures and the Congress and of the regulations that state and federal 
officials develop to implement these laws. One cannot help but ask why?

Figure 1. Characteristic Differences Between Compliance Systems
and Accountability Systems

Figure 2. Characteristics of America's Best-Run Companies

• A bias for action

Source: Peters and Waterman (1982).

The answer begins to emerge when we examine the differences
between the logic of accountability systems and the logic of compliance
systems. Figure 1 contrasts the characteristics of compliance systems and
accountability systems. The lists are not meant to be exhaustive but merely
suggestive of the differences.

The characteristics of accountability systems listed in Figure 1 are 
not unlike the characteristics of America's best-run companies that Peters 
and Waterman (1982) have identified. Figure 2 lists the characteristics of
America's best-run companies. 

If we compare the two lists, it seems clear that systems of
accountability and systems of excellence share the same fundamental characteristics:
a bias for action and change based on processes that allow for differences
among participants; that tolerate failure and reward success; that promote
autonomy, entrepreneurship, and initiative; that share information; and
that seek objectives that are meaningful to those involved.

Comparison of the two lists also makes it clear that the characteristics
of compliance systems are in direct conflict with the characteristics of
America's best-run companies. Where accountability systems seek and promote
excellence, compliance systems develop and implement minimum standards.
In short, where accountability systems engage individuals to do and be all
that they can do and be, compliance systems demand that individuals do
and be what they are told to do and be, no more and no less.

Minimum Standards, Testing, and Assessment

In its report on transforming the state role in undergraduate
education, the Education Commission of the States (1986) advances eight
challenges facing undergraduate education and makes twenty-two
recommendations to state leaders for dealing with the challenges that it has
identified. The report is directed at how states and state leaders can create a
positive environment for institutional leaders in the hope it will contribute
significantly to national discussions and to state action.

The most significant and unique feature of this document is the
consistent use of accountability as the basis of argument and the
concomitant emphasis on assessment rather than on testing: "The term assessment
is being used to refer to all sorts of activities, from testing basic skills of
freshmen to certifying graduates' minimum competences, from evaluating
academic programs to judging whole institutions . . . The terms testing
and assessment often are used interchangeably, which further complicates
an already complicated issue . . . assessment has also become a major concern
of state leaders. To date, they have been most concerned about enforcing
minimum standards for student progress and using standardized tests as
tangible evidence that undergraduate education does make a difference . . .
But, testing is not synonymous with assessment, nor should it be . . .
Standardized tests have some particularly serious drawbacks" (Education
Commission of the States, 1986, p. 4).

The Education Commission of the States (1986) report cites the
following as limitations of testing and standardized tests: "To evaluate
undergraduate education solely on the basis of minimum competence contradicts
its very purposes. The outcomes must include knowledge, skills, and
attitudes that go far beyond basic skills" (p. 9). "The standardized tests that
several states have used to assess system effectiveness were not designed for
that purpose . . . Qualitative data must be considered as well as quantitative
data" (p. 9). "The need to assess student and institutional performance in
ways that improve teaching and learning is not reflected in current efforts"
(p. 4). "Screening should not be confused with assessment as a means of
improving teaching and learning. To document performance is not to
improve performance" (p. 9).

In response to these limitations, the panel makes a number of
recommendations. Collectively, the recommendations lay out a strategic plan
for integrating assessment into the total process of evaluating student and
institutional outcomes. The plan includes the establishment of "early
assessment" programs to determine the readiness of high school students for
college work and to identify high-risk students and the help that they need in
order to stay in school and be successful; the development of special
assessment programs, including guidance and counseling, for assessing the
educational needs both of returning and of new students, especially those who
might be classified as nontraditional; the use of multiple indicators of
effectiveness (student demography, program diversity, adequacy of instructional
learning resources, student preparation for college work, student
participation and completion rates, student satisfaction and placement, alumni and
employer satisfaction, work force development, and overall student
educational attainment) to evaluate systemwide outcomes; and the encouragement
of institutions to develop their own indicators of effectiveness to reflect their
distinctive undergraduate education mission, including student participation
and completion rates, measures of student-faculty interaction, faculty
contribution to the improvement of undergraduate education, student
performance within and among majors, writing samples, senior projects, student
satisfaction and placement, alumni and employer satisfaction, and faculty
development activities.

Assessment and Accountability

Without assessment there can be no accountability. At the same time,
without accountability the states and their colleges cannot know whether
assessment programs and services are achieving intended purposes. However,
the implementation of accountable assessment programs requires deliberate
actions at both state and college levels.

At the state level, the governor, the legislature, and the governing
boards must first concur on the purposes of assessment. Without this
essential agreement, it will not be possible for the colleges to demonstrate
accountability in meeting expectations for outcomes. Second, the breadth and depth
of the services needed to achieve these identified purposes must be
established, and what the colleges will be asked to provide must be clearly
understood. Unless this is done, the colleges will not be able to implement
appropriate programs of service and referral, nor will they be able to com- 
municate the information to the state that justifies the allocation of funds. 
Third, outcomes expectations must be clearly defined for the assessment
programs and services that the colleges provide, and these expectations must 
be consistent both with the funding that is provided and with the purposes 
that have been agreed on for assessment. Fourth, accountability criteria
must be developed to provide the structure necessary for implementing
assessment programs and services. Colleges are thus free to achieve desired
outcomes in ways best suited to the populations that they serve. Minimum
standards, which by their very nature can do no more than provide a floor 
for the delivery of programs and services, are excluded in favor of systems of
review that look at the performance of the colleges in meeting the criteria. 
Fifth, funding must be provided at a level that makes it possible to do the 
job that needs to be done. Colleges must be authorized to provide a variety 
of structures through which assessment programs can be delivered, and they
must be funded sufficiently to provide such alternatives. Where appropriate
staffing to implement state-level assessment purposes is lacking, additional
funding must be allocated for staff development of existing personnel and
for the recruitment of additional staff. Sixth, the state education code must 
support the purposes and outcomes that have been agreed on. Existing 
sections of the education code that are compliance based or that restrict the 
colleges' freedom to structure their assessment programs in the best interests 
of the students and communities that they serve must be replaced with code 
sections that base the evaluation of program success on accountability. 

At the college level, boards of trustees, administrations, faculties, and 
staffs must first establish an institutional climate in which assessment is
viewed as a broadly based instructional and student planning and evaluation 
process. In general, accountable assessment programs are integrated into the 
total educational program; they are viewed as part and parcel of a single 
purpose. Second, a broad-based student assessment program must become
an integral part of the delivery of instruction at all levels authorized:
remedial, developmental, and college-level. At a minimum, the assessment
program must include aptitude, career, skills, and self-concept assessment
instruments and techniques of sufficient variety to ensure that the full range
of students who are likely to enroll can be assessed. Where appropriate 
college capabilities are lacking, students must be referred to assessment
programs external to the college. Third, students' success expectancies must be
based on locally normed assessment scores in relation to remedial and
developmental program components and college-level courses. Student
demographic information must be taken into account. Use of a single standardized
test must be avoided, as must reliance solely on standardized tests.
Standardized tests are notorious for their lack of cohort reliability both between
cohorts in a given time frame and across time frames for a given cohort. In
addition, the range of the different combinations of correct and incorrect
answers to questions can produce like scores on standardized assessment
instruments. Hence, all students who score the same on the same subpart of
a given standardized test do not have the same skills strengths and
weaknesses. Writing samples and similar college-based assessment tools in
mathematics and oral communications must be used as supplements to
standardized tests. Fourth, evaluation and student follow-up must become 
an integral part of the design of the assessment program. Such evaluations 
and follow-up must examine the effectiveness of the various program com- 
ponents in order to ascertain which assessment instruments predict what 
program results for which groups of students under what circumstances 
and conditions. Fifth, assessment information must be used to make
curriculum decisions that accommodate students' differing learning styles. Sixth,
college support to ensure the success of the assessment program must be 
made available through funding and staff development opportunities that 
prepare administrators, counselors, faculty, and support staff both to
implement and to evaluate assessment services. This support must be enhanced
through the development and implementation of policies and procedures
that are supportive of student access to and success in education programs
of substance and high quality at every level of instruction. Without the
basic institutional support that these factors represent, the desired outcomes 
of the college's assessment program are likely to remain objectives. 
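
The third point above, that equal subscores can conceal different skills, can be made concrete with a small count. The sketch below assumes a hypothetical ten-item subtest scored one point per correct answer; the item count, the score of 7, and the function names are illustrative only and do not describe any actual instrument.

```python
from itertools import combinations
from math import comb


def patterns_with_score(num_items: int, score: int) -> int:
    """Count the distinct right/wrong answer patterns that yield one raw score."""
    return comb(num_items, score)


def example_patterns(num_items: int, score: int, limit: int = 2):
    """Return the first few sets of correctly answered item numbers for a score."""
    items = range(1, num_items + 1)
    return [set(c) for _, c in zip(range(limit), combinations(items, score))]


if __name__ == "__main__":
    # Hypothetical subtest: 10 items, raw score of 7.
    print(patterns_with_score(10, 7))
    for correct in example_patterns(10, 7):
        print(sorted(correct))
```

On these assumptions, two students with the same subscore may have missed entirely different items, which is the reason the text asks for writing samples and other college-based supplements.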

The Other Side of Assessment

In short, the Education Commission of the States (1986)
recommendations prescribe state-level and collegewide agreement on the purposes,
levels of service, and expected outcomes of assessment programs; funding
sufficient to allow the accomplishment of goals and objectives; flexibility to
meet local differences in student needs as determined by demographics; and
supportive state education code and college policy and procedure language
that emphasizes the accomplishment of results, not program structuring 
and service delivery. 

Clearly, the Education Commission of the States panel views
assessment as a broadly based system to ascertain student readiness for college
work; to provide students, counselors, instructors, and others with the
information necessary for ensuring student success; to allow individual colleges
to know for whom and how they have been effective; and to enable state
education systems to gauge the extent to which students are being served
and state priorities are being met. Clearly, this effort goes beyond student
performance testing and screening and beyond minimum standards.

But, just as clearly, even the best and most comprehensive assessment
program will ultimately be constrained from accomplishing its objectives if
it results in a denial of access. This is not simply a matter of individual 
educational opportunity. In a world where the leading edge of technology 
changes daily, the future of this nation and its citizens depends on the ability 
of our education systems to prepare each and every one of us to participate 
effectively. 

This, then, is the other side of assessment. It is the capability of our
colleges to be accountable for the purposes for which programs of assessment
are conducted. It is the capability of our colleges to enable student success
while maintaining access to meaningful educational opportunity for a
citizenry characterized by an increasing diversity of culture and skills readiness
to participate effectively in the American educational structure. It is the
capability of our colleges to demonstrate their effectiveness under conditions of
underfunding and the often different educational objectives of states, their
public colleges, and the citizens who enroll. It is ultimately, more than
anything else, the capability of our colleges to meet each person on his or her
terms, to assess his or her individual educational needs, career and life goals,
and objectives and to be in a position to provide programs of education that
are appropriate and relevant to those needs, goals, and objectives.

And so we come full circle. The ancients labored to control the
environment so as to better ensure their futures. As they developed knowledge,
they turned to magic to bring powers they did not have to their aid through
procedures that ensured outcomes. In short, they endeavored to make the
unknown predictable. Today, we labor under similar circumstances: to
control the educational process so as to better ensure the futures of our
students. In many ways, education is like magic: It is a process that, when 
done correctly, produces desired outcomes. Our task and challenge is to 
make the results of what we do in assessment knowable and known, to 
make educational outcomes predictable.



Perhaps it is time to shift the focus of our attention from
statewide mandated testing to classroom testing, surely
a neglected area on most campuses.



Assessment and Improvement 
in Education 

John Losak 



Testing has taken on new dimensions as a part of higher education in the
U.S. since several states began to legislate standards across the board for all
students, not just those in specific professions (for example, law, nursing).
At both the point of entry and the point of exit, testing programs have had
an impact that is likely to increase in the near future, not to abate. Yet, by
and large, classroom testing has been left untouched. One of the hidden
factors driving the strong movement for minimal exit competencies is that
classroom testing practices have not assured that students do indeed have
basic skills. 

A major assumption of both exit and entry-level testing is that any
judgments that are arrived at can be sounder and perhaps even wiser if there
are objective and standardized measures of achievement that can be reviewed.
There is no question that the judgments will be arrived at with or without
an exhaustive testing program. Rather, the question is whether those
judgments can be improved by the use of a testing program. I believe that use of
a standardized testing program either for course placement or for exit
examinations can positively influence the judgments that are needed at these two
points. Although knowledge of a student's high school curriculum is useful
for initial course placement decisions, it is well known that the same subject
is not taught with the same level of rigor or examination in all high schools.
Therefore, a common placement examination helps the adviser or other
decision maker who works with the student to effect a more appropriate
placement than could be achieved if the student's achievement on the high
school curriculum were the only basis for the decision making. The same
analogy holds for decisions regarding the award of the associate degree to
students who have progressed through two years of a college curriculum.
Common testing has a way of assuring that common learning has occurred
and of assuring the public and the legislators who represent the public that
the goals, values, and objectives that have been deemed important and
appropriate are in fact demonstrably achieved in an objective manner.

Is there then a direct link between the effort to improve the quality
of education and the installation of a program of standardized testing? A
direct cause-and-effect relationship is quite difficult to establish. We in
Florida have found that there are important spinoff effects that encourage the
use of common examinations to make placement decisions and to assure
minimal exit competencies. At Miami-Dade Community College, we have
identified such spinoff effects as improved faculty morale, strong student
support, and strong community support. All these effects reflect an
increasingly positive attitude toward higher education. Moreover, there is evidence
that student learning is affected by the level of expectations that instructors
and others have of students and that, as these levels of expectations are
raised on common examinations, student performance often follows.

It should also be said that the imposition of a standardized testing
program on a shaky infrastructure probably does no more than reflect the
weakness of the infrastructure. If the purpose of examination is to provide
guidance on the strength or weakness of the curriculum, the testing program
may be useful. However, the testing program will not in itself improve the
quality of a poor infrastructure, although it may provide some guidance on
the reforms that are needed in order for the curriculum and student learning
to improve.

In summary, standardized testing for entry-level course placement
decisions and exit examinations can be effective in assuring that certain
basic concepts have been learned and that students who need remedial efforts
receive the remedial courses. Moreover, there is evidence that the initiation
of such a testing program conveys a message of positive educational value
to many constituencies in higher education, including students, faculty,
and lay citizens. We do well to remember that one of the real dangers of
testing is to imply that all low-scoring students should be denied entrance
to college. Studies that we at Miami-Dade Community College have
conducted suggest that a student who is academically underprepared at entrance
is not incapable of learning.

The exit test administered to sophomores in Florida can be cited as
an example of state intervention in the examination process. The
College-Level Academic Skills Test (CLAST) required by the state of Florida for an
associate in arts degree consists of a series of tests designed to measure the
communication and computation skills that community college and state
university faculty members expect students who complete the sophomore
year in college to possess.

In spring 1979, the Florida legislature enacted a law requiring
identification of basic skills. In August 1979, the office that directs the program
at the state level was established. During the next two years, these skills
were identified, and item specifications were developed. The first test was
given in fall 1982, and passing standards were first required in fall 1984.

It is difficult to estimate the overall cost in dollars of the CLAST to 
the state of Florida. At Miami-Dade Community College, we estimate that 
the direct costs are close to $7 per student. The state awards a contract to the
office of instructional resources at the University of Florida, and the cost per
student at the state level is approximately $13. If a 25 percent indirect cost is
added to the local cost and the state cost, the $25 per-student cost multiplied
by the 34,722 students tested in the 1985-86 academic year means that the 
total cost was $868,050. 
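
The arithmetic behind this estimate can be retraced in a few lines. The sketch below is only a check on the figures quoted in the text ($7 local, $13 state, 25 percent indirect cost, 34,722 students); the constant and function names are illustrative assumptions and are not part of any state costing procedure.

```python
# Retrace the CLAST cost estimate for 1985-86 from the figures in the text.
# All names here are illustrative assumptions, not an official costing method.

LOCAL_DIRECT = 7.00       # dollars per student, Miami-Dade estimate
STATE_DIRECT = 13.00      # dollars per student, state-level contract
INDIRECT_RATE = 0.25      # 25 percent indirect cost on both components
STUDENTS_TESTED = 34_722  # students tested in 1985-86


def per_student_cost(local: float, state: float, indirect_rate: float) -> float:
    """Direct costs plus the indirect-cost markup, per student."""
    return (local + state) * (1 + indirect_rate)


def total_cost(local: float, state: float, indirect_rate: float, students: int) -> float:
    """Per-student cost multiplied by the number of students tested."""
    return per_student_cost(local, state, indirect_rate) * students


if __name__ == "__main__":
    unit = per_student_cost(LOCAL_DIRECT, STATE_DIRECT, INDIRECT_RATE)
    total = total_cost(LOCAL_DIRECT, STATE_DIRECT, INDIRECT_RATE, STUDENTS_TESTED)
    print(f"Per-student cost: ${unit:,.2f}")
    print(f"Total cost: ${total:,.2f}")
```

The direct costs sum to $20 per student, the 25 percent markup brings that to $25, and multiplying by the number tested reproduces the total given above.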

One of the primary impacts of the intervention of state legislators in 
the assessment of students has been the clear message to faculty in the state
of Florida that their past evaluations of students have not been satisfactory.
The requirement that students demonstrate minimal scores before they are
awarded an associate in arts degree continued to influence the award of
grades by faculty. Test scores have risen during the four years in which the
examination has been administered. We must be cautious in interpreting
the higher scores, because there are at least three plausible explanations:
The students who are taking the examination have gotten better, efforts to
improve the curriculum have been successful, or wide dissemination of
information about the form and content of the examination has made the
students testwise. Another visible impact is that the number of associate in arts
graduates has dropped. At Miami-Dade Community College, associate in
arts graduates have been reduced by 40 percent.

CLAST is in place in the state of Florida essentially because the 
public had lost faith in the assessment process used by instructors in 
their classrooms to arrive at grades. Why is it that students who received 
the associate degree and who functioned at a C level or better in the 
classroom could not read, write, or compute at a high school level on the 
CLAST? The reason is that most instructors evaluate on a normative basis, 
and the talent that is before them decides the norm. In addition, few 
instructors have either the training or the inclination for the role of 
measurement and evaluation. A grade of C in an introductory psychology 
course at Swarthmore does not reflect the same mastery of content that the 
C grade does at a two-year open-door college. One important component of 
the issue of grade inflation is the fact that many instructors would have 
to award a very high proportion of F grades if the same expectations for 
content mastery were to be demanded in every institution, open-door 
community college as well as select liberal arts college. 

As for the issue that the instructor must also be an evaluator, it is 
clear that American higher education does not prepare its graduates in 
discipline areas for the role of assessor. Some critics have argued that a 
master's degree or a Ph.D. in chemistry, history, geography, or English 
has not prepared the graduate either to instruct or to evaluate. I will 
focus here only on the fact that the instructor must spend between one 
quarter and one third of her or his time on measurement. I include in my 
estimate the time spent conceptualizing, developing, scoring, returning, 
and interpreting the materials to students. In all likelihood, few 
instructors in the disciplines just mentioned have had even a single 
course in measurement, much less advanced courses in assessment. Scriven 
(1982) offers a thorough and severe critique on this issue. 

If I am right, the most viable solution is to weaken the link between 
the teaching and evaluation roles expected of instructors. This is not a 
new idea. As O'Neill (1987, p. 2) has noted, as early as 1869, Charles 
Eliot, the president of Harvard University, "called for an external 
examining body that would be distinct from the teaching body in the 
granting of degrees." At the University of Florida as recently as twenty 
years ago, university examiners prepared the tests for students in their 
first two years, and instructors had virtually no role in evaluation. This 
system was modeled after the system that Robert Hutchins had put into 
place at the University of Chicago. 

In my opinion, the extreme dependence of our evaluation system on 
faculty judgment makes it an anachronism, and it should either be 
overhauled or discarded. Seventy-five or a hundred years ago, we could 
afford instructors' ineptness in assessment both because most students 
were highly selected and motivated to begin with and because classes were 
usually quite small, which increased the opportunity for the personal 
interaction that permits an instructor to make a relatively informed 
judgment about a student without having any real knowledge of assessment. 
In contrast, today's supermarket system of education, in which classes are 
very large, requires a different plan for the evaluation of student 
learning. Either faculty must become a great deal more sophisticated and 
rigorous in their system of evaluation, or evaluation by units external to 
the classroom will increase. Computer-assisted assessment may well be the 
technology that makes an increasingly rigorous and sophisticated student 
evaluation feasible. The institution where it is most important to 
separate teaching from evaluation activities is the two-year open-door 
college. However, because large numbers of the students who enroll in 
classes at any college are underprepared, the question of the extent to 
which teaching and evaluation can appropriately be made more separate than 
they currently are is germane to all institutions of higher education. 

Finally, if the role of the instructor as evaluator decreases, will 
standards be imposed from without? It is precisely the inability of those 
within higher education to solve the assessment issue that leads 
legislative bodies to impose standards and procedures. Increasing our 
reliance on common examinations written by local discipline experts (that 
is, departmental and even baccalaureate-level examinations, which some 
colleges still provide) will serve to provide benchmarks; relieve the 
instructor from time-consuming, frustrating, and often onerous tasks; and 
permit the instructor to focus on the teaching function. It should also 
provide a more realistic basis for the appraisal of student learning. 

Perhaps it is time to shift the focus of our attention away from 
statewide mandated testing to classroom testing, as I have suggested here. 
It is in the classroom that student learning is most directly assessed, 
and it is in the classroom that thought and energy should be devoted to 
our attempts to improve higher education through assessment. 

Gains in learning are expected of college students. This chapter 
reviews the pros and cons of value-added assessment and 
proposes several alternative approaches. 



Value-Added Assessment: 
College Education 
and Student Growth 

Marcia J. Belcher 



Higher education is under fire. Officials in the federal government warn 
of closer scrutiny. State legislators move to assess the impact of state 
dollars on higher education. Major groups have issued reports that decry 
the quality of undergraduate education and urge reforms. At the heart of 
these matters are the questions of what is excellence in higher education 
and how it can best be attained. 

Astin (1985) argues that the traditional views of excellence, which are 
tied to reputation (translated as selectivity and size) and resources 
(also tied to reputation), do not really either measure or promote 
excellence in higher education. To replace them, Astin proposes an 
approach that emphasizes educational impact or value added, since "true 
excellence resides in the ability of the college or university to affect 
its students favorably, to enhance their intellectual development, and to 
make a positive difference in their lives" (Astin, 1984, p. 27). 

The value-added approach emphasized by Astin focuses on changes 
in students between the beginning and the end of their college careers. As 
Turnbull (1987, p. 3) has noted, "the root idea of assessing how much 
students learn or improve or grow in school or in college, as well as how 
they stand at graduation, is not only a good and important idea but 
obviously one that lies near the heart of the education enterprise." 

It is an idea that is gaining momentum. State coordinating boards in 
Tennessee and South Dakota require value-added testing, and several other 
states, including Colorado, Maryland, New Jersey, and Virginia, are 
considering the approach. An increasing number of individual institutions 
have implemented value-added initiatives. The best-known example is 
Northeast Missouri State University, which has used such a system since 
1974. Its approach includes using standardized tests for freshmen and 
sophomores, major field examinations for graduating students, and attitude 
surveys of students and alumni. 

Arguments for and Against Value-Added Assessment 

In the current debates over value-added assessment, three major issues 
keep emerging. One issue focuses on growth and on whether this is the best 
way of conceptualizing excellence in higher education. The second issue is 
how the installation of value-added assessment will change the institution. 
The third issue is whether the value-added measurement method can capture 
the learning process in higher education. 

Value-Added Assessment Emphasizes Growth. Should growth or 
competence be the standard used to judge excellence? To base our judgment 
of an institution on the quality of its graduates ignores the skills and 
abilities with which its graduates arrived. A selective college can be 
confident that its graduates will be successful, since its students have 
been selected on these very same measures. Including the inputs could 
change the institutions that are considered excellent. 

Astin (1984) argues that value-added assessment promotes the goal of 
educational equity, since it places the emphasis on improvement. Students 
are not denied opportunities because they perform at a low level on entry. 
Gains or improvements are the focal point, and institutions and individuals 
alike have an opportunity to be excellent under this approach. 

For others, improvement is an insufficient basis for the making of 
judgments. These people argue for bottom-line ("minimal") standards that 
all must meet and discount the issue of improvement. While Manning 
(1987, p. 52) agrees that value-added assessment is a good method for 
evaluating instructional programs, he worries that the "truly deceptive 
aspect of the value-added philosophy lies in the effort of some of its 
proponents to tie student assessment too narrowly to the notion of 
improvement rather than to criteria of competency." Most proponents of 
value-added assessment hasten to note that measuring improvement does not 
replace the possibility of setting a floor by exit standards. 

Exit standards are often thought of as involving assessment at the 
time when a student is ready to receive a degree. Catamaro (1987) points to 
the diversity of students within the community college system and to their 
broad spectrum of goals. He argues that many attend community colleges 
specifically because they want a value-added education (that is, specific 
skills or competencies), not a set of competencies tied to the completion 
of a degree. 

Even if all were to agree that it is important to measure improvement, 
it can be difficult to do so. Measurement specialists have wrestled for 
years with ways of measuring and comparing gains. Another problem lies 
in linking growth to instruction. As Warren (1984) notes, it may be that 
students are of such high ability that they will learn a great deal, 
whatever the quality of the instruction that is provided. Also, high 
entering skill levels that provide little room for growth may limit the 
amount of change that is seen. 

Another measurement issue involves the question of whether the 
same students are being measured at the beginning and at the end. Looking 
at the average increase in a measure taken at entrance and graduation may 
say more about the retention policy of the institution than it does about 
the quality of the education that the institution provides (Turnbull, 
1987). If the only students who are left are the students who entered 
scoring high, then improvement is automatically shown. 
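
A small numerical illustration makes the retention problem concrete. The 
scores below are invented for the purpose and describe no real 
institution: six hypothetical students enter, the two lowest scorers leave 
before graduation, and none of the four who stay improves at all. 

```python
from statistics import mean

# Hypothetical cohort: (entry score, exit score or None if the student left).
cohort = [(40, None), (45, None), (55, 55), (60, 60), (70, 70), (80, 80)]

# Naive comparison: everyone at entry versus whoever is left at exit.
entry_all = mean(entry for entry, _ in cohort)
exit_survivors = mean(exit_ for _, exit_ in cohort if exit_ is not None)
apparent_gain = exit_survivors - entry_all

# Matched comparison: only students measured at both points.
matched_gain = mean(exit_ - entry for entry, exit_ in cohort if exit_ is not None)

print(f"apparent gain: {apparent_gain:.2f}")
print(f"matched gain:  {matched_gain:.2f}")
```

The naive comparison reports a gain even though no individual score 
changed, while the matched comparison reports none. Restricting the 
analysis to students tested at both points is one simple safeguard, though 
it does not by itself say what would have happened to those who left. 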

Value-Added Assessment Will Change the Way in Which Institutions 
Operate. Critics of value-added assessment fear that value-added testing 
on a statewide basis will lead to a uniform curriculum and hamper 
individuality. Teachers may feel forced to emphasize skills assessed by 
the test to the detriment of other subject areas. 

Astin and Ewell (1985) argue that colleges and universities are in 
the business of developing student learning. A value-added perspective 
asks faculty to state objectives for the curriculum and to think in 
developmental terms. If the result is that faculty become more explicit 
about what should be taught to all students and more attentive to whether 
learning occurs, then a uniform curriculum is a benefit, not a drawback. 
The process would help to focus institutional attention directly on the 
teaching-learning process. 

Value-Added Assessment Makes Assumptions About What Learning 
Is. Can value-added assessment capture the process of learning? Arguing 
that learning in higher education involves a reconfiguring of patterns, 
Manning (1987, p. 52) concludes that "a valid measure of initial status in 
a subject matter may be inappropriate to measure performance at a higher 
level of learning." Turnbull (1987, p. 4) agrees, stating that it is "the 
patterns and interrelations among the indicators that count." Warren (1984) 
follows a different line of reasoning to reach a similar conclusion. He 
argues that an effective pretest for a course assesses the prerequisite 
knowledge needed for the course but that this knowledge is not the 
knowledge or capability needed at the end of the course. Nevertheless, 
using a different test at the end of the course would make it impossible 
to compare scores. Warren believes that the same argument holds true when 
we try to compare institutions. 

Astin and Ewell (1985) reply that in many areas knowledge is 
cumulative, hierarchical, and measurable along a continuum. Therefore, 
knowledge is amenable to value-added assessment. Even critics of 
value-added assessment concede that it can be useful when it comes to 
knowing more about generic competencies, such as writing and critical 
thinking. 

Warren (1984) believes that much value-added measurement is trivial 
and cites pre- and posttesting of course content as an example. He argues 
that performance at the end of a course is an acceptable indicator of the 
effects of the course. Astin and Ewell (1985) reply that value-added 
assessment in courses is only one component of the value added and that 
the implementation of value-added assessment has not trivialized 
discussions of learning outcomes at institutions where it has been tried. 

Although critics of value-added assessment have been assured that it 
does not need to be confined to the use of a standardized test, the 
impression continues. For example, Turnbull (1987, p. 5) urges that a 
variety of assessment techniques be used to measure student progress, 
adding that "the idea that a test is going to give you more than a fraction 
of what you are interested in learning about progress toward the broad 
goals of higher education is, at this date, totally illusory." 

Alternative Methods of Measuring the Value Added 

Though value-added assessment has traditionally been thought of as 
pre- and posttesting, that approach is not the only way in which 
value-added assessment can be implemented. According to Turnbull (1987), 
both progress and the end product are important in assessing the value of 
education. Assessing improvement is most useful when we compare the 
effectiveness of institutions or programs from year to year. He suggests 
preserving a set of senior theses as benchmarks for varying levels of 
acceptability and recording the proportion of the senior class that meets 
the various benchmarks. The benchmarks can be saved and used to compare 
individual institutions with one another as well. 

The beauty of the approach just described is that it allows the 
evaluation to be more holistic than it can be in standardized testing. 
However, the approach has several drawbacks, including deciding on what 
will be assessed (for example, creativity, grammar, critical thinking, 
logical presentation of ideas) and on how to assess it reliably. 

If standardized tests and placement tests are used and if improvement 
in writing and math skills is the issue (as it is in many community 
colleges), then a second and perhaps supplemental process might be employed 
to assess the value added. I propose a four-step process whereby the 
institution would administer an entry-level test in basic skills and use 
the resulting scores to place students in their initial level of 
coursework; decide which curricular variables should be related to the 
level of basic skills measured at the point when the student graduates and 
collect information on these skills for each student; select a test of 
basic skills to be given at the point of graduation (it can be the test 
used at entry, or it can be a more difficult test on the same content 
area); and conduct a yearly analysis (using a statistical technique, such 
as multiple regression) to assess the extent to which the entering level 
of basic skills and the curricular variables predict the exit level of 
basic skills. 

Such a process could answer the question about the relative 
contributions of entering skills and the curriculum. Because the analysis 
would account for the possibility of shifting levels of basic skills, the 
changing contributions of the curriculum across the years could be 
assessed. 
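
The yearly analysis in the fourth step can be sketched as follows. This is 
a minimal illustration under stated assumptions, not the analysis 
actually run at any institution: the eight students and their scores are 
invented, a single course grade stands in for the curricular variables, 
and the gain in R-squared when the curricular variable is added to the 
entry score is used as one possible index of the curriculum's 
contribution. 

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(rows, y):
    """Least-squares fit with an intercept; returns (coefficients, R-squared)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)                        # normal equations
    pred = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot


# Invented data: entry test score, grade in one course (0-4), exit test score.
entry = [30, 35, 40, 45, 50, 55, 60, 65]
grade = [2, 4, 1, 3, 2, 4, 1, 3]
exit_score = [262, 284, 267, 293, 290, 315, 299, 320]

_, r2_entry = fit([[e] for e in entry], exit_score)
_, r2_full = fit(list(zip(entry, grade)), exit_score)

print(f"R-squared, entry skills only:      {r2_entry:.3f}")
print(f"R-squared, entry plus curriculum:  {r2_full:.3f}")
print(f"increment attributed to curriculum: {r2_full - r2_entry:.3f}")
```

Repeating the same fit for each graduating class would show whether the 
increment attributed to the curriculum grows or shrinks over the years. 
The increment depends on the order in which predictors are entered, so it 
should be read as a relative index rather than as an exact amount of 
learning. 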

Results from the type of analysis just described showed that the 
curriculum at Miami-Dade Community College played a large role in 
predictions of exit skills in computation for A.A. graduates but that 
reading skills still depended heavily on the level of reading ability that 
students brought to college (Belcher, 1986). In computation, the entering 
level of basic skills was less predictive for black students than it was 
for other groups. No differences were found in communication. Figure 1 
depicts the results for communication, and Figure 2 depicts the results 
for computation. 

The analysis just described used the Comparative Guidance and 
Placement Program (CGP) tests in reading, writing, and computation to 
measure entry-level skills. The four subtests of the College-Level Academic 
Skills Test (CLAST), namely reading, writing, computation, and a 
holistically scored essay, were used to measure exit-level skills. The 
curricular variables were grades in two English courses and one math 
course and the number of credits earned in developmental English, math, 
and English-as-a-second-language courses. The amount of time that had 
elapsed since the students completed their major English and math courses 
was included to account for the forgetting that can take place over time. 
Belcher (1986) provides further details on the study. 

This approach to value-added assessment has some statistical and 
conceptual problems. For example, it assumes both that the curriculum can 
be defined and that the effects are cumulative and linear. The relationship 
between the curriculum and the exit level of skills depends in part on the 
strength of the relationship between entry and exit skills. In the instance 
just described, the exact amount of change in skill level could not be 
assessed. 

However, the inherent relativity of this approach can also be viewed 
as a strength. The question, What is the value of a college education? must 
be countered by the question, Compared to what? While the ultimate answer 
might compare the skill development of college graduates with the skill 
development of students who do not graduate (since students can continue 
to mature whether they are in college or not), this approach assumes that, 
without the college curriculum, students who enter with the highest level 
of basic skills will exit with the highest level and that those who enter 
at the bottom will exit at the bottom. If the curriculum helped to maintain 
these rankings, then differences due only to the curriculum would not be 
seen. It could also be argued that the impact of curriculum could be 
unidirectional; that is, curriculum affects only those at the bottom, not 
those at the top. Therefore, improvement would be demonstrated 
statistically, but important differences would be masked by this level of 
analysis. 

Figure 1. Contribution of Basic Skills at Entry and Curriculum 
in Predicting Communication Skills at Exit 

Conclusion 

Value-added assessment is one of several solutions currently being 
offered as tools for remediating the weaknesses of higher education. It 
will be some time before sufficient evidence is available to judge the 
effectiveness of this approach and to determine whether proponents or 
critics were correct in their evaluations. Legislators and the general 
public need an approach that is both valid and simple. If value-added 
assessment is implemented without regard to the information needs of 
administrators, faculty, and students or to the unique character of the 
institution, it will probably fail. If it is implemented thoughtfully with 
the full participation of all interested parties and with multiple measures 
and approaches, it may succeed in providing focus to the real goal of 
higher education, teaching and learning, and in bringing lasting and 
beneficial change to higher education. 

Figure 2. Contribution of Basic Skills at Entry and Curriculum 
in Predicting Computation Skills at Exit 

Teacher-made tests are more than assessment devices: They are 
a fundamental part of the educational process. They can define 
instructional purposes, influence what students study, and help 
instructors to gain perspective on their courses. How well the 
tests accomplish these purposes is a function of their quality. 



The Role of 
the Teacher-Made Test 
in Higher Education 

Scarvia B. Anderson 



Let us examine two myths. Myth one: Students study because they want to 
learn. A few students study because of their intrinsic interest in the 
subject matter (accounting, personality theory, the English novel). But 
most undergraduates study only as much as they have to: to get by and to 
get through, to retain their scholarships or to maintain their athletic 
eligibility, to keep their families or their employers off their backs. 
Myth two: Colleges and universities have a profound influence on students' 
ability and motivation to learn. There are a few notable exceptions, but by 
and large the more knowledgeable and able students in high school are also 
the more knowledgeable and able students in college. Furthermore, the 
students who are more knowledgeable and able to start with are the 
students who are likely to profit from instruction. Thus, when colleges 
are compared on the basis of output, the variance between institutions can 
be attributed more to the characteristics of the students whom the 
institutions admit than it can to the programs offered. The value-added 
approach to institutional evaluation keeps selective colleges from taking 
credit where it is not due, but any comparisons between the value added by 
different colleges must take into account the caliber of the students that 
each college had to work with. 

D. Bray and M. J. Belcher (eds.). Issues in Student Assessment. 

The editors of the New York Times headlined an article I wrote 
about classroom tests "Tests That Stand the Test of Time" (Anderson, 1985). 
After it appeared, I received many letters from college professors, school 
administrators, and others saying that it was about time someone had 
something to say about something other than standardized tests. But one 
writer took me firmly to task for denigrating standardized tests. That was 
not my point at all; the two kinds of tests serve quite different purposes. 
I emphasized that standardized tests, the ones that get all the publicity, 
frequently have something to do with who gets certain educational 
opportunities, while teacher-made tests, the silent majority that you do 
not hear much about, are the tests that determine what education is. 

Long before standardized testing became a multimillion-dollar business, 
students at every educational level took the local tests and examinations 
that determined whether they got an A or a C, passed the course, 
accumulated enough credits to receive a degree, or received a favorable 
recommendation from the instructor. Such tests have three fundamental 
educational properties: First, more than any other educational device, 
teacher-made tests tell students what the purpose of the instruction is and 
what is expected of them. If the English professor asks only one question 
on Moby Dick and it is, What different kinds of whales did they encounter 
on their voyage? he has certainly given students an inadequate reason for 
studying this great novel. Second, what students study is what they think 
they are going to be asked about in the instructor's tests. The first myth 
was that students study for the joy of learning. The student below the 
graduate level who does is rare indeed, and some professors complain that 
many graduate students are not self-motivated. There is no point in 
Xeroxing supplementary reading lists if students are not queried on the 
contents of the readings. Third, the preparation of good tests helps 
instructors to gain perspective on their courses and sometimes even to 
understand better what they are teaching. Paul Diederich, a distinguished 
English teacher and scholar, was once asked if he understood Eliot's Four 
Quartets. He scratched his head and said, "I don't know. I've never tried 
to write an exercise on it." 

Knowing that tests and examinations define instructional purposes 
and instructors' expectations, profoundly influence what students study, 
and help instructors to gain perspective on their courses places 
considerable responsibility on those who make up the tests. People who 
develop standardized tests for commercial establishments have the luxury of 
plying their trade full-time. College professors have to fit test making 
into a schedule that includes a great many other things: preparation and 
delivery of courses, committee or administrative assignments, student 
advising, research, and so on. It is no wonder that many of the tests that 
are made up hurriedly on the way to class, that are kept in the files of 
student clubs, or that are stored in the microcomputers that departments 
are so proud of are not very good tests. They do not focus on what is most 
important, they do not inspire students to study what is worth studying, 
and they do not present an intellectual challenge to the examinees, not to 
mention the examiner. 

There are basically two functions that educational tests should assess: 
knowledge and skills. Knowledge, which includes understanding and 
inference as well as information, can be measured both by good essay 
questions and by short-answer, multiple-choice, and other objective types 
of items. Even the much-maligned true-false questions can be used if the 
task is in fact to identify the truth or falsity of propositions. For 
example, these seem to be legitimate true-false items (Ebel, 1965, p. 139): 

A receiver in bankruptcy acquires title to the bankrupt's 
property. T F 

More heat energy is required to warm a gallon of cool water 
from 50 degrees F to 80 degrees F than to heat a pint of the 
same cool water to boiling point. T F 

The shortcut of statements taken verbatim from the textbook neither puts 
the true-false item to good use nor produces a good test. 

Of all the objective types of items, the multiple-choice form is 
probably the most generally useful, and, contrary to popular opinion, 
multiple-choice items can be used to measure a diversity of cognitive 
processes. For example, consider these items: 

The concept of the plasma membrane as a simple sievelike structure 
is inadequate to explain the 

a. passage of gases involved in respiration into and out of the cell. 
b. passage of simple organic molecules, such as glucose, into the cell. 
c. failure of protein molecules to pass through the membrane. 
d. ability of the cell to admit selectively some inorganic ions while 
excluding others. 

To select the correct answer (d), the student must know that the living 
plasma membrane has properties in addition to those served by the thin 
films usually used in laboratory demonstrations of osmosis (Educational 
Testing Service, 1963). 

Thick with towns and hamlets studded, and with streams and vapors gray, 
Like a shield embossed with silver, round and vast the landscape lay. 
At my feet the city slumbered. From its chimneys, here and there 
Wreaths of snow-white smoke, ascending, vanished ghost-like into air 

The poet most likely to have written these lines is 

a. Stephen Vincent Benet 
b. Emily Dickinson 
c. Henry Wadsworth Longfellow 
d. Edgar Allan Poe 
e. Walt Whitman 



Note that in this item the student is not asked or expected to recognize the 
lines from memory. Instead, he or she is expected to identify them with the 
style of one of the poets (Longfellow) or, conversely, to reject them as unlike 
the style of any of the other four. 

It is far from easy to write good multiple-choice items. Even the best 
item writers are frequently frustrated in their attempts to invent a 
plausible but incorrect fourth or fifth choice, and some materials do not 
lend themselves to a fixed number of choices. 

Harold Gulliksen, the well-known measurement theorist, advocates 
a type of item that combines multiple choice with matching. These items 
are easier and quicker to construct than either of the parent types, and 
they are quite well suited to certain kinds of content. Each exercise 
presents a small number of responses and a large number of "statements" 
(terms, phrases, quotations, and so on), and students use each response 
several times. For example, in current history, you might list five 
religions and ask students to characterize each of fifteen nations in terms 
of the religion of the majority: 



Religion of Majority: a. Catholic; b. Hindu; c. Moslem; 
d. Protestant; e. Other 

___  1. Argentina 
___  2. Canada 
___  3. Costa Rica 
___  4. France 
___  5. India 
___  6. Japan 
___  7. Malaysia 
___  8. Pakistan 
___  9. Philippines 
___ 10. Republic of Ireland 
___ 11. U.S.S.R. 
___ 12. U.K. 
___ 13. Uruguay 
___ 14. U.S. 
___ 15. Yemen 



You can see the possibilities of this type of item, which is sometimes called 
a key-list exercise, for genres or periods in literature, types of government, 
classes of compounds in chemistry, concepts in business law, and so on. 

By definition, college professors profess on many topics, and many 
of them profess to despise objective tests. If they admit using them, it is 
only out of practical necessity with their largest classes. However, I hope 
to have shown that objective tests can do a rather nice job of measurement 
in many instances and that a set of good objective questions is superior to 
a set of bad essay questions. By bad I mean questions like these: 

Discuss the causes of the Civil War. 

What is the greatest social achievement of the twentieth century? 

The responses to such questions are almost impossible to grade fairly. The 
best grades usually go to the more verbal students, not to the students who 
know more about the subject matter. Of course, instructors who write good 
essay questions have clear grading rubrics in mind from the outset. 

There are circumstances in which instructors must ask students to 
write their answers, for example, when they want to know how well they 
can write, whether they wish only to observe the students' mastery of 
simple mechanical conventions or their ability to express complex ideas or 
write creatively. 

As I indicated earlier, there are two things that college tests should
assess, knowledge and skills, and the reason is simple: knowledge and skills
are what most college courses are all about. Up to this point (with the
exception of the issue of writing tests), I have focused on the measurement
of knowledge. To measure skills, you usually need to ask students to do
something:

Make a scale drawing of a public building.
Speak extemporaneously on a popular topic.
Write a letter of application for a job.
Prepare a souffle.
Write a proposal for an experiment.
Analyze a blood sample.
Transpose a piece of music into another key.
Edit a technical manuscript.
Write a computer program.

It is seldom sufficient to ask students about drawing, speaking, writing,
cooking, and so on, although there is usually some basic knowledge important
to the development of such skills that can be measured separately.

The guidelines for the construction of good performance tests do not
differ from the guidelines for the construction of good paper-and-pencil
tests of knowledge: First, specify the criteria to be used for rating or
scoring the performance or product. Second, state the problem so that students
are absolutely clear about what they are supposed to do. Third, if possible,
tell students the basis on which their performance will be judged. Fourth,
avoid any irrelevant difficulties in the content or procedures of testing. For
example, do not require students to work through an elaborate set of written
instructions in order to demonstrate that they can carry out routine
computations. Fifth, if possible, give the students a chance to perform the
task more than once or to perform several task samples.

Most colleges and universities make an attempt to judge the teaching
proficiency of faculty members. While many of these attempts are informal,
some departments are seeking more systematic ways of rating teaching
proficiency in terms of such variables as course content and organization,
classroom techniques, encouragement of students to think creatively, and
evaluation practices. Review of some of the instructor's tests is essential in
order to rate him or her on evaluation practices. However, review of the
faculty member's tests and examinations may also shed light on other
variables. For example, if examination questions are limited to textbook
examples, there is little evidence that the faculty member encourages students
to think creatively. Thus, the examinations that are used to evaluate students
may also figure in the evaluation of teaching proficiency.

Those who develop and administer aptitude, basic skills, IQ, and
other standardized tests are constantly being called on to defend the use of
such tests. The tests discriminate against some segment of the population,
the tests are "coachable," the tests exert an unhealthy influence on the
curriculum: these are just some of the charges. But how many college teachers
have ever had to defend the fact of course examinations and quizzes? Students
expect them, administrators expect them, regents expect them. What
college teachers should be called on to defend is the quality of the tests
that they give and the influence that the tests exert on student learning.

The use of direct writing assessment on a large scale seems to
be growing. This chapter reviews the process of developing
a writing assessment program.



Assessment of Writing Skills 
Through Essay Tests 

Linda Crocker 



The essay is the oldest form of written examination. Dubois (1970) has
documented its use in Chinese civil service tests as long ago as 2200 B.C.
Written essay examinations were used in medieval European universities. In
the nineteenth century, Francis Galton (1948) used the marks assigned by
Cambridge University examiners to an eight-day essay examination to
demonstrate that achievement test scores for large samples followed an
approximately normal distribution. Even the first British civil service
examinations were entirely essay in format. In the United States, the essay
item was the predominant form used in college admissions testing until the
1920s, when the more easily and more objectively scored multiple-choice item
became popular (Breland, 1983).

While widespread use of items requiring written responses has waned
in the measurement of many academic subjects, essay testing continues to
play a dominant role in the measurement of writing ability. Thus, the
measurement literature distinguishes between the notions of essay compositions
and essay test items. In essay subject area examinations, knowledge of a
specific academic subject, such as history or biological science, is assessed.
The examinee's writing ability is usually considered to be peripheral to the
characteristic of major interest. In the essay composition, the examinee's
writing ability is the trait being assessed. The written essay represents a
performance sample that allows for direct assessment of the examinee's
writing ability. The focus of this chapter is on the use of the essay for
direct assessment of writing ability.

D. Bray and M. J. Belcher (eds.). Issues in Student Assessment.

Why Should Essay Examinations Be Used?

The use of essay items to test examinees' knowledge of the rules of
grammar, knowledge of the mechanics of writing, or spelling ability is not
generally recommended. These skills can be tested more efficiently with
objective test item formats. Nevertheless, the essay is still widely used to
test ability to organize information, express ideas, generate original thought
or solutions, communicate with expression, or demonstrate stylistic aspects of
writing. The essay format has some well-known limitations, including the
time-consuming scoring process and the subjectivity involved in the evaluation
of examinees' responses. Despite these problems, the credibility that
essay items have with instructors, administrators, examinees, and the public
at large (Rentz, 1984) is a strong argument for their continued use. In this
same vein, Diederich (1974, p. 1) pointed out the logical appeal of collecting
writing samples when we want to draw inferences about students' writing
abilities: "Whenever we want to find out whether young people can swim,
we have them jump into a pool and swim."

Today, the use of direct writing assessment on a large scale seems to
be growing. Direct writing assessment is included in the National Assessment
of Educational Progress, the English composition test administered as
part of the College Board's admissions testing program, the Test of English
as a Foreign Language (TOEFL), statewide assessment programs for public
school students, and most recently statewide assessment programs at the
college and university level. A prominent example of the last type of program
is the state of Florida's College-Level Academic Skills Test (CLAST).

The purposes of the writing assessment programs just named are
quite diverse. They range from differentiating among examinees for selection,
to certification of minimal competency skills, to identification of individual
strengths or weaknesses for instructional placement or remediation.
Thus, the first step in the development of a writing assessment must be to
identify the primary purpose to be served by the data that will be collected.
Adhering to the goals of the assessment is essential in subsequent decisions
about how to structure the writing assessment program.

Once the objectives to be sampled by the writing tasks have been
specified, the process of instituting a large-scale testing program for the
direct assessment of writing typically involves a series of steps, such as
those outlined by Meredith and Williams (1984) or Quellmalz (1984b). These
steps include the development and field-testing of a large pool of suitable
topics or prompts, the development of scoring procedures, the selection and
training of scorers, the administration of the examination, the scoring of the
resulting writing samples, and the assessment of the reliability and validity
of the examinees' scores. These steps will be considered in the remainder of
this chapter.

Developing Prompts 

An important consideration in large-scale writing assessment is the
development of a sizable pool of topics or prompts that can be used to
generate the examinees' written responses. Unlike objective tests, which can
be kept secure after development and reused many times, new topics must
be available each time the writing examination is administered, because
examinees can remember the essay topics and pass them on to cohorts who
will take the test at a later sitting. In creating multiple prompts, the task
is to ensure that the topics are different enough to offer no advantage to
those who write at later sittings yet similar enough to maintain comparability
in terms of the skills assessed and the level of difficulty.

In assessments of basic writing skills, the prompt typically specifies
the topic, the audience to whom the writing is to be addressed, a suggested
structure for the response, and the mode of discourse (Quellmalz, 1984b;
Meredith and Williams, 1984). Mode of discourse (or aim of writing) is
illustrated by the five categories suggested by the National Council of
Teachers of English: narrating, explaining, describing, reporting, and
persuading (Tate and others, 1979). In writing assessment programs in higher
education, the intended audience and the mode of discourse are sometimes
implied rather than explicitly stated in the prompt.

Most authorities recommend that an essay prompt should have seven
characteristics: First, the topic should be a thought-provoking stimulus that
gives the examinee some latitude for self-expression. Second, the topic
should be specific enough to ensure some common theme or core of content
in the responses of examinees, although their viewpoints may vary. Third,
the prompt should provide a structure for the examinee's response. This
structure can often be achieved by suggesting that the examinee use examples,
give an opinion and supporting reasons, or address both sides of an
issue. Fourth, the content of the topic should be within the general
experience of all examinees. For example, an item that asks examinees to
describe their position on a particular recent event may leave some examinees
at a disadvantage because they are uninformed in this area. Fifth, the topics
should not afford an advantage to examinees of a particular gender, racial
or cultural group, or socioeconomic class. For example, a topic related to
sports can be viewed as biased against females. Even such a topic as "My
Most Memorable Summer Vacation" may leave some examinees with little
to write if they have never had an opportunity to take a summer vacation.
Sixth, the topic should avoid controversial political or social issues. Asking
examinees to state their positions on abortion or use of illegal drugs may
introduce an unwanted bias into the scoring process, since some raters might
find it difficult to evaluate objectively papers that expressed positions
drastically at odds with their own personal beliefs. Seventh, expectations for
the length of the essay and scoring criteria should be explicitly stated. Time
limits should also be specified.

One fairly controversial issue in the development of writing prompts
that must be addressed is whether to provide examinees with a choice
among several topics or to require all examinees to write on the same topic.
The proponents of multiple topics argue that examinees usually perceive
this practice as fairer and that it may be a way of avoiding undesirable
cultural bias in the selection of topics. The critics of providing a choice of
topics cite the difficulty of ensuring that the topics are equal in difficulty
and the possibility that examinees who unwittingly choose the more difficult
topic may earn lower scores (Hoetker, 1982; Rosenbaum, 1985). Another
problem is that examinees who begin to write on one topic and then change
their minds lose valuable time. At present, no single position is universally
accepted in large-scale writing assessment programs for secondary school
and college students, but Dovell and Buhr (1986) point out that the literature
on the reliability of essay scores generally advocates requiring all examinees
to write on the same topic or topics.

After the prompts are written, they are typically reviewed by a panel
of experts who check to see that they are consistent with the purpose of the
assessment program. The experts may also evaluate other qualities of the
prompts, such as those mentioned earlier. Technical aspects of the prompts,
such as grammar, readability, length, and the quality of any artwork, should
also be reviewed.

Developing Scoring Procedures 

The three most commonly used scoring procedures in large-scale
writing assessments are holistic scoring, analytic scoring, and primary trait
scoring. The term holistic scoring refers to the practice of having a rater
read the essay and make an overall judgment about its quality. Typically, a
number from a continuum is assigned as the outcome of this scoring process.
The rater is usually provided with some verbal description of the qualities
that should be considered in assigning ratings. The rater may also be
provided with criteria for assigning each separate numeric value. Sample
responses that typify each category in the scoring continuum are sometimes
provided as reference points.

The term analytic scoring refers to the practice of having the rater
evaluate each essay with a specific list of features or points in mind and
assign a separate score for each point. The total score assigned to the
response is the sum of the scores for the specific features. The best-known
example of an analytic score guide for essay compositions is probably
Diederich's (1974) scale, which requires the rating of ideas, organization,
wording, flavor, usage, punctuation, spelling, and handwriting. The rating
guide for functional writing used in the Illinois writing assessment program
rates examinees' essays on a six-point scale for focus, support, organization,
and mechanics (Chapman, Fyans, and Kerins, 1984).
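
Because an analytic total is simply the sum of the feature ratings, the
arithmetic can be sketched in a few lines of Python. The feature names follow
the Illinois guide just described; the assumption that the six-point scale
runs from 1 to 6, the assumption that the four ratings are summed with equal
weight, and the function name analytic_total are illustrative and are not
confirmed by the source.

```python
from typing import Dict

FEATURES = ("focus", "support", "organization", "mechanics")
SCALE_MIN, SCALE_MAX = 1, 6  # assumed end points of the six-point scale


def analytic_total(ratings: Dict[str, int]) -> int:
    """Sum one rater's feature ratings after checking that all are valid."""
    for feature in FEATURES:
        if feature not in ratings:
            raise ValueError(f"missing rating for {feature!r}")
        if not SCALE_MIN <= ratings[feature] <= SCALE_MAX:
            raise ValueError(f"rating for {feature!r} is off the scale")
    return sum(ratings[feature] for feature in FEATURES)


# A hypothetical essay rated by one reader.
essay = {"focus": 4, "support": 3, "organization": 5, "mechanics": 4}
print(analytic_total(essay))
```

Keeping the separate feature ratings, rather than only the total, preserves
the diagnostic information that distinguishes analytic from holistic scoring.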

The term primary trait scoring refers to procedures developed for use
in scoring the writing samples collected as part of the National Assessment
of Educational Progress (NAEP) (Lloyd-Jones, 1977). Primary trait scoring
is based on the assumption that the purpose of assessment is to determine
the ability of examinees to perform fairly specific types of writing tasks. In
the context of a task involving writing a letter to persuade a reluctant
landlord to allow the writer to keep a puppy, Mullis (1984) describes the four
scoring categories for the evaluation of the resulting writing as follows:
"Generally a '1' paper would present little or no evidence, a '2' would have
few or inappropriate reasons, a '3' would be well thought out with several
appropriate reasons, and a '4' would be well organized with reasons supported
by compelling details." In contrast to holistic and analytic scoring,
primary trait scoring uses scoring criteria that vary with the task assigned.

Training Raters 

Mullis (1984) has described the procedures used by the Educational
Testing Service for scoring the English composition test and the NAEP
writing exercises. In general, a set of anchor papers that a panel of expert
or master raters has scored is selected to represent each scoring category.
Training includes the discussion of scoring guidelines and the particular
features of each category, with illustrations drawn from the anchor or
standard papers. Meredith and Williams (1984) advocate the use of papers that
represent both solid and borderline examples of the scoring categories. During
training, raters receive feedback on the extent to which their ratings match
those of the experts.

After training, raters must demonstrate their expertise by successfully
rating a set of qualifying papers that a panel of experts or master scorers
has already rated. It is necessary to establish a criterion for satisfactory
performance on this qualifying task in advance. Sachse (1984) reports that
trainees in the Texas writing assessment program must match master scorers'
ratings on at least [illegible] percent of two sets of qualifying papers
before they can serve as scorers.
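
A qualifying criterion of this kind reduces to an exact-agreement rate compared
with a threshold fixed in advance. The sketch below is one hedged reading of
it: the threshold is passed in as a parameter, the value 0.80 used in the
example is purely illustrative and is not the Texas figure, and the choice to
require the threshold on each set separately is an assumption. The function
names are hypothetical.

```python
from typing import List, Sequence


def agreement_rate(trainee: Sequence[int], master: Sequence[int]) -> float:
    """Proportion of papers on which the trainee matches the master rating."""
    if len(trainee) != len(master) or not master:
        raise ValueError("rating lists must be non-empty and of equal length")
    matches = sum(t == m for t, m in zip(trainee, master))
    return matches / len(master)


def qualifies(trainee_sets: List[Sequence[int]],
              master_sets: List[Sequence[int]],
              threshold: float) -> bool:
    """True if the trainee reaches the threshold on every qualifying set."""
    return all(
        agreement_rate(t, m) >= threshold
        for t, m in zip(trainee_sets, master_sets)
    )


# Two hypothetical qualifying sets of five papers each, rated 1 to 4.
master_sets = [[3, 2, 4, 2, 3], [2, 3, 3, 4, 1]]
trainee_sets = [[3, 2, 4, 1, 3], [2, 2, 3, 4, 1]]
print(qualifies(trainee_sets, master_sets, threshold=0.80))
```

Setting the threshold before any trainee is rated, as the text recommends,
keeps the decision from being adjusted to fit the people being trained.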

Field-Testing the Prompts and Scoring System 

After review, the prompts are field-tested by administering them to a
sample of respondents on an experimental basis. Responses obtained in the
field tests are scored. Sachse (1984) suggests that the field test responses
should be examined for possible miscues in the prompt, the possibility of
reader boredom, the ease with which the scoring guides can be applied, the
appeal to examinees, and the level of difficulty. Topics must be equal in
difficulty if examinees are to be given a choice of topics in the actual
writing situation or if examinees must score above a fixed performance
standard and different topics are to be used on different testing occasions.
The most common practice for estimating the difficulty level of a prompt is to
compute the mean score of the responses to it (Dovell and Buhr, 1986). It is
also desirable to examine the variance of the distribution of the responses to
the prompts that have been field-tested. Rosenbaum (1985) describes some
technical approaches to the scaling of topics for difficulty.
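
The mean and variance of field-test scores can be tabulated per prompt with
Python's standard library. This is a minimal sketch, assuming made-up scores
on a four-point scale; the prompt labels and the helper name prompt_summary
are hypothetical.

```python
from statistics import mean, variance
from typing import Dict, List

# Hypothetical field-test scores (one score per response) for two prompts.
field_test_scores: Dict[str, List[int]] = {
    "prompt_A": [3, 4, 2, 3, 3],
    "prompt_B": [1, 4, 2, 1, 2],
}


def prompt_summary(scores: List[int]) -> Dict[str, float]:
    """Mean (difficulty estimate) and sample variance of one prompt's scores."""
    return {"mean": mean(scores), "variance": variance(scores)}


for name, scores in field_test_scores.items():
    summary = prompt_summary(scores)
    print(name, summary["mean"], summary["variance"])
```

A lower mean suggests a harder prompt, and a markedly different variance
suggests that the prompt spreads examinees out differently, so both are worth
comparing before prompts are treated as interchangeable.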

From the field test it is also possible to estimate the time required to
score a typical essay and hence to estimate the number of raters who will be
needed, the amount of time required to complete the scoring, and the cost
of scoring. It is also possible to identify any additional issues that may need
to be addressed in the training of raters.
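
The workload estimate is simple arithmetic once the field test has supplied a
reading time per essay. The sketch below assumes every essay is read a fixed
number of times and that each rater works a fixed number of minutes; all of the
figures and both function names are hypothetical.

```python
import math


def raters_needed(n_essays: int, reads_per_essay: int,
                  minutes_per_read: float,
                  minutes_per_rater: float) -> int:
    """Smallest number of raters who can finish within the time available."""
    total_minutes = n_essays * reads_per_essay * minutes_per_read
    return math.ceil(total_minutes / minutes_per_rater)


def scoring_cost(n_essays: int, reads_per_essay: int,
                 minutes_per_read: float, hourly_rate: float) -> float:
    """Total rater pay, assuming pay is proportional to reading time only."""
    total_hours = n_essays * reads_per_essay * minutes_per_read / 60
    return total_hours * hourly_rate


# Hypothetical program: 6,000 essays, two readings each, 2.5 minutes per
# reading, raters available for two six-hour days (720 minutes each).
print(raters_needed(6000, 2, 2.5, 720))
print(scoring_cost(6000, 2, 2.5, hourly_rate=12.0))
```

The sketch leaves out training, recalibration sessions, and adjudication
readings, all of which add to the real time and cost.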

Scoring the Writing Samples

When a large-scale writing assessment has produced thousands of
essays, such details as the physical setting for the raters' workplace and the
logistics of arranging the essays into packets and distributing them to raters
become crucial. One common practice is to assign raters to small groups
presided over by a table leader who is responsible for supervising the scoring
process within that group. In addition, there are usually one or more chief
raters who are available as resource persons to answer questions that may
arise. Ideally, each rater should record scores on a separate sheet that other
raters will not see.

Typically, each essay is read by two or more raters, and the scores
that they award are combined by summing or averaging in order to determine
the examinee's final score. A critical part of most scoring processes is
how to deal with the cases when the scores assigned to an essay do not
agree. In a minimum competency testing situation, adjudication of such
cases is necessary only when the discrepant ratings fall on opposite sides of
the pass cut score. In norm-referenced writing assessments, adjudication can
be invoked when the discrepancies exceed a certain range of points. Breland
(1983) notes that one fairly common procedure for adjudication is to have
another reader (for example, the table leader or chief reader) score the essays
that have received discrepant ratings.
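
The two adjudication triggers and the combining step can be written down
directly. The following sketch assumes two raters per essay and assumes that
the cut score applies to each individual rating; the function names, the
four-point scale, and the example values are hypothetical. How the third
reading is folded into the final score is left out, because the source does
not specify it.

```python
def crosses_cut(rating_1: int, rating_2: int, cut: int) -> bool:
    """Minimum competency rule: ratings fall on opposite sides of the cut."""
    return (rating_1 >= cut) != (rating_2 >= cut)


def exceeds_range(rating_1: int, rating_2: int, max_gap: int) -> bool:
    """Norm-referenced rule: ratings differ by more than the allowed gap."""
    return abs(rating_1 - rating_2) > max_gap


def combine(rating_1: int, rating_2: int, method: str = "sum") -> float:
    """Combine two ratings by summing or averaging."""
    if method == "sum":
        return rating_1 + rating_2
    if method == "average":
        return (rating_1 + rating_2) / 2
    raise ValueError("method must be 'sum' or 'average'")


# Hypothetical four-point scale with a passing rating of 3 or higher.
print(crosses_cut(2, 3, cut=3))        # one rater passes, one fails
print(exceeds_range(1, 3, max_gap=1))  # ratings are two points apart
print(combine(3, 4, method="average"))
```

Under the first rule a pair such as 3 and 4 is never sent to a third reader,
even though the raters disagree, because the pass or fail decision is the same
either way.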

Once the actual scoring process is under way, a common practice is
to add some blind, prescored standard papers to the responses so that the
accuracy of the scorers can be monitored and drift in scoring standards can
be controlled. Frequent practice calibration sessions should also be conducted
during the scoring process to maintain rater consistency. For example,
Meredith and Williams (1984) describe a process in which each day's scoring
session begins with a recalibration round using a standard set of five to ten
papers.

Assessing Reliability 

When the term reliability is applied to test scores, it generally means
the degree of consistency in relative scores earned by a given group of
examinees over replicated testing situations. In large-scale writing
assessments, where different packets of papers must be graded by different
scorers, the issue of reliability usually centers on whether different raters
would assign similar ratings to the same composition. Another question is
whether the performance of examinees is consistent over different topics
within the same mode and over different modes of writing. As noted earlier,
one important step in the planning of large-scale writing assessment is to
conduct field tests of the prompts and scoring procedures. The data from these
field tests should be collected within the framework of a research design that
allows these reliability issues to be investigated. After the assessment
system is in place, ongoing monitoring of the reliability of the scoring
process should be part of the assessment program.

A variety of approaches can be used to demonstrate the degree of
reliability in the scores assigned to writing samples. Three are commonly
used: indexes of decision consistency, such as the proportion of examinees
consistently classified into pass/fail categories or the proportion of
examinees consistently classified into all categories used in the scoring
system; correlations of the scores assigned by all possible pairs of raters or
correlations of the scores obtained from the same individuals on different
writing samples; and variance components and generalizability coefficients
obtained by applying analysis of variance. From a technical standpoint, the
analysis of variance procedures, which are based on generalizability theory,
are generally recommended by measurement experts (Coffman, 1971; Meredith and
Williams, 1984). There are two main reasons for this recommendation: The
approach can be applied for any number of raters, and it makes it possible
to estimate how many different sources of variance (for example, raters,
tasks, occasions, time limits, instructions to raters or examinees) affect the
scores of a set of essays. Crocker and Algina (1986) show how generalizability
theory can be used in various single-facet designs where multiple raters rate
essays. Llabre (1978) provides a detailed illustration of the application of
generalizability theory to writing assessment, using raters, modes of writing,
and occasions as sources of variation.
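
The first two kinds of index are straightforward to compute for a pair of
raters. The sketch below is illustrative only: the ratings are invented, the
cut score of 3 is an assumption, and the function names are hypothetical. The
correlation is the ordinary Pearson coefficient, written out by hand so that
the example needs nothing beyond the standard library.

```python
from math import sqrt
from typing import Sequence


def decision_consistency(r1: Sequence[int], r2: Sequence[int],
                         cut: int) -> float:
    """Proportion of essays given the same pass/fail decision by both raters."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("rating lists must be non-empty and of equal length")
    same = sum((a >= cut) == (b >= cut) for a, b in zip(r1, r2))
    return same / len(r1)


def exact_agreement(r1: Sequence[int], r2: Sequence[int]) -> float:
    """Proportion of essays placed in the same score category by both raters."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined if either rater gives constant scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings of six essays by two raters on a four-point scale.
rater_1 = [1, 2, 3, 4, 2, 3]
rater_2 = [2, 2, 3, 4, 1, 4]
print(decision_consistency(rater_1, rater_2, cut=3))
print(exact_agreement(rater_1, rater_2))
print(round(pearson(rater_1, rater_2), 2))
```

In this invented example the raters agree on every pass/fail decision but on
only half of the exact categories, which shows why the index should be chosen
to match the decision the scores will support.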

It is important for the method that is used to estimate reliability to
reflect the way in which the scores for the writing samples are to be used in
decision making. Thus, the procedure used to derive the examinees' scores
should be taken into account in the estimation of reliability. For example,
if examinees' scores are derived by summing or averaging the scores of
multiple raters, the appropriate generalizability coefficient is estimated
differently than it is if the score of a single rater is used.
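
The difference between the single-rater and multiple-rater coefficients can be
seen in a single-facet design in which every rater scores every essay. The
sketch below is an assumption-laden illustration, not the procedure of any
program named here: it estimates variance components from the usual two-way
analysis of variance mean squares and computes the coefficient for relative
decisions, with the residual variance divided by the number of raters whose
scores are combined. The data and function names are hypothetical.

```python
from typing import List, Tuple


def variance_components(scores: List[List[float]]) -> Tuple[float, float]:
    """Return (examinee variance, residual variance) for an essays-by-raters
    table in which each row is one examinee and each column is one rater."""
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    person_means = [sum(row) / n_r for row in scores]
    rater_means = [sum(row[j] for row in scores) / n_p for j in range(n_r)]

    ss_person = n_r * sum((m - grand) ** 2 for m in person_means)
    ss_rater = n_p * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_residual = ss_total - ss_person - ss_rater

    ms_person = ss_person / (n_p - 1)
    ms_residual = ss_residual / ((n_p - 1) * (n_r - 1))
    # Negative estimates are set to zero, a common convention.
    var_person = max((ms_person - ms_residual) / n_r, 0.0)
    return var_person, ms_residual


def g_coefficient(var_person: float, var_residual: float, k: int) -> float:
    """Generalizability coefficient when scores from k raters are combined."""
    return var_person / (var_person + var_residual / k)


# Hypothetical ratings: five examinees (rows) scored by two raters (columns).
ratings = [[4, 4], [3, 2], [2, 2], [1, 2], [3, 4]]
var_p, var_res = variance_components(ratings)
print(round(g_coefficient(var_p, var_res, k=1), 3))  # single rater's score
print(round(g_coefficient(var_p, var_res, k=2), 3))  # two raters combined
```

With the same variance components, the coefficient for the combined score is
higher than the coefficient for a single rating, which is why the estimate
must match the way final scores are actually derived.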

Assessing Validity 

Four different approaches have been used to estimate the validity of
the ratings obtained from writing assessments. Breland (1985) offered a
comprehensive review of validation studies of essay tests for college and
secondary school students. Concurrent criterion-related validations have used
such criteria as high school class rank, high school grade point average,
English grades in college courses, cumulative college grade point average, and
instructors' ratings of students' writing abilities. The range of the validity
coefficients for sixteen studies conducted between 1954 and 1983 was .05-.43.
Predictive validity coefficients, which used such criteria as grades in college
English courses, semester grade point averages, and essay posttest scores,
ranged from .21 to .57. Breland's review further revealed that increments to
validity were relatively modest when essay tests were used in conjunction
with objective test scores and other predictors. However, Quellmalz (1984a)
has suggested that the criteria used in such validation studies may be
inadequate to represent the usefulness of direct writing assessments.

When writing assessments are used to assess instructional effectiveness
or mastery of basic skills, it is appropriate to consider the content
validity of the writing assessment tasks. Quellmalz (1984a) advocated using
the same procedures for assessing the content validity of objectives and item
specifications and the content validity of writing tasks and rating scales.

The concept of construct validity is appropriate to considerations of
the issues of what trait or traits are measured by the writing tasks and
scoring system. Several different types of studies seem relevant in the
construct validation of writing tests. One approach is to examine whether
holistic scores and analytic scores are a function of a common underlying
trait. Chapman, Fyans, and Kerins (1984) have reported a construct validation
of this type that used factor analysis. Breland (1983) noted that a central
issue is whether direct and indirect measures of writing measure the same
trait. The study conducted by Quellmalz, Capell, and Chou (1982) illustrates a
third type of construct validation for writing tests. These researchers used
confirmatory factor analysis to investigate whether different traits can be
measured by different direct writing tasks. Finally, a thorough construct
validation of a writing assessment should probably establish the extent to
which essay scores are free from extraneous influences of variables that may
be present in this situation. For example, handwriting has often been
demonstrated to influence raters' judgments of essay quality (Chase, 1968,
1986). Context effects (that is, the effects of the quality of other essays
read prior to the essay in question) have also been shown to affect essay
scores (Daly and Dickson-Markman, 1982; Hughes and Keeling, 1984). A thorough
construct validation plan would include identification of extraneous variables
and study of their impact on the scoring of writing samples.

Conclusion 

Given the cost, the problems of establishing reliability and validity,
and the time required to develop a sound writing assessment program,
university and college educators and administrators may well ask, Is it worth
it? In response, advocates of writing assessment point to the profound effects
that such tests have had on secondary and college curricula and on classroom
instructional practices. For example, Rentz (1984, p. 4) has described
the impact of the inclusion of a writing test in the regents' testing program
of the Georgia university system: "When the test was first administered in
1972, some colleges were abandoning freshman English composition as a
requirement. Five years later, all colleges in the state required two
composition courses, and about half these schools required three. Furthermore,
the content of these courses consisted of writing, writing, writing.
Instructional personnel were hired because they could teach writing. Faculty
in other subject areas began to require writing. . . . It might be hard to
solve some of the measurement problems, but direct assessment of writing by
using a writing sample has credibility. The yield will be well worth the
investment."

Because the proficiencies of entering students have declined
over the past twenty years, the need for placement testing 
has increased greatly. This chapter discusses the factors to be 
considered in developing assessment and placement programs: 
which students should be tested, how testing should be carried 
out, which tests should be used, and how tests should be 
interpreted. 



A Primer 

on Placement Testing 

Edward A. Morante 



The term placement testing is used in higher education to describe a process
of student assessment, the results of which are used to help to place entering
college students in appropriate beginning courses. While such a process has
existed at many colleges for years, the proficiencies of entering students
have declined over the past twenty years, and both the need for and the use
of placement tests have increased markedly. This chapter discusses which
students should be tested, when placement testing should be carried out,
and the variables that are important in selecting a placement test, and it
suggests a process for using tests in placement. It also discusses the
competing claims of standardized and in-house tests, the issues of statewide
testing, and the rationale for placement testing.

Who Should Be Tested? 

Who should be tested? The answer seems simple: All entering students
who need or who would be helped by a course or by a level of a
course outside the regular college-level program. English and mathematics
are required at virtually every college, even in most certificate programs, but
we cannot assume that all students enter college at the same level of
proficiency in these subjects. A placement test or a battery of tests is
essential in determining which courses or which levels of courses are most
appropriate to individual students. Used in conjunction with other background
information, test scores are essential in appropriate course placement.
Individualized course placement is an essential step in retaining students.

Why Can Admissions Tests Not Be Used? 

Admissions tests, like the Scholastic Aptitude Test (SAT) or the American
College Test (ACT), are inappropriate for placement when used in
isolation. They can be helpful in a comprehensive placement process if the
results are considered in conjunction with scores on a placement test as well
as other background information, but by themselves they provide insufficient
and sometimes misleading information for placement.

The SAT and the ACT are designed to select among the brighter,
more competent college applicants. While these tests differentiate among
the better students, the task of a placement test is to differentiate among the
less proficient students. The items on an admissions test and the items on a
placement test are selected for these separate purposes. The time constraints
are also different. As noted later in this chapter, placement tests should be
unspeeded so that students can demonstrate how much they know, not how
fast they can perform. The designers of admissions tests are interested in
knowing both the level of a student's proficiency and the speed with which
the student can demonstrate that proficiency, because the combination of
knowledge and quickness is important in predicting success in college.
Admissions tests are thus more closely aligned with aptitude tests, which
assess how capable a prospective student is of learning. Placement tests
should be used to measure proficiency, not aptitude or capability, and they
should not be used to predict future success.

The SAT and the ACT are inappropriate as sole placement devices 
also because they do not accurately measure proficiency in basic skills. In 
New Jersey, for example, the Basic Skills Council compared SAT results 
with the results of the New Jersey College Basic Skills Placement Test 
(NJCBSPT). The council found that many students with above-average 
SAT scores were still not proficient enough in basic skills to be ready for 
college-level courses. The conclusion of this analysis, which was first carried 
out in 1978 and then repeated in 1986, was that a placement test was needed 
for accurate placement even for students who performed above the national 
average on the SAT. 

Why Can High School Grades Not Be Used? 

High school grades can and should be used in making placement 
decisions, but only in conjunction with a placement test. High school 
grades, the type and number of courses taken in high school, grade point 
average, and rank in class are all helpful variables in making placement 
decisions. However, there are two reasons why none of these indicators, 
used alone or in combination, is sufficient. First, many students (the so- 
called nontraditional students) have been away from high school for a 
number of years. Their high school performance may not accurately mea- 
sure their current proficiencies. This issue appears to be especially important 
for mathematics, which many students seem to forget if they do not use it 
regularly. 

Second, high school transcripts can be difficult to interpret, and they 
are sometimes even contradictory. Different schools, programs, teachers, and 
courses provide little continuity, which is necessary for understanding and 
measuring the proficiencies of students. While the fact that one student 
lacks certain courses may indicate that the student's proficiency in that area 
is apt to be low, the fact that another student has successfully completed 
what appear to be appropriate high school courses in the area is no 
guarantee of the student's proficiency. This is true even for recent high school 
graduates of a college preparatory curriculum. For example, the New Jersey 
Basic Skills Council (1986) found that only 2.5 percent of the recent high 
school graduates who had successfully completed a college preparatory 
mathematics curriculum were proficient in elementary algebra and that fully 50 
percent of the students could not successfully answer even half of the algebra 
problems on the test, where the most difficult question was of the form 
ax = c - bx, solve for x. Indeed, 36 percent of the same students could not 
successfully answer nineteen of the thirty questions on an arithmetic test 
that measures proficiency in fractions, decimals, and percents. It is beyond 
the scope of this chapter to explain these results. Let it suffice to say that it 
is risky to rely on high school performance as a measure of proficiency in 
the making of placement decisions. Thus, the use of a test specifically 
designed for placement is essential. 

In-House and Standardized Tests 

The development of basic skills placement tests by local faculty is 
widespread. The resulting tests are generally referred to as in-house tests. 
While the writing of an essay topic or of mathematics problems appears to 
be relatively easy, most faculty seem to agree that the development of a 
reading test or a multiple-choice writing test lies beyond the capabilities of 
most local groups. 

This consensus masks a deeper problem. While the writing of items 
or questions appears to be relatively simple for some, especially for those 
who have taught for many years, the writing of good, unambiguous items 
that discriminate well among students of different groups, that are unbiased, 
and that relate well to the total test score is much more complex than it 
appears to be on the surface. In addition, the combination of items to form 
a comprehensive test that is both reliable and valid is very difficult to 
accomplish without a process of pretesting, statistical analysis, and objective, 
professional review. In addition, the development of alternate forms, which is 
important for retesting and posttesting, requires a level of sophisticated 
psychometrics that most faculty do not have or do not use in developing an 
in-house test. 

The biggest complaint that faculty make against standardized tests 
seems to be that such tests do not measure what they want students to know 
or that the tests do not measure what faculty teach. However, the same 
complaint could be made against standardized tests, depending both on 
which test was selected and on what was taught in the curriculum. 
In-house tests can be written to reflect a selected curriculum, but they may not 
provide accurate measurement. Faculty and administrators need to review 
the advantages and disadvantages of these two types of tests for the purpose 
of placement. 

Selecting a Placement Test 

The selection of an appropriate placement test is one of the most 
important factors in a comprehensive developmental education program. 
The placement test and the cut scores that are used cannot be differentiated 
from the standards of quality set by the college. Nine factors should be 
considered in any decision about a particular placement test, including an 
in-house test: the test's content, referencing, discrimination, speededness, 
reliability, validity, and cost; its control for guessing; and the availability of 
alternate forms. 

Content is the most critical variable in decisions about the quality of 
placement tests. The test or test battery should include reading, writing, and 
mathematics. It can address other areas as well, depending on the needs of 
individual programs or institutions. The reading component should be 
realistic and holistic. The topics or passages should cover a range of subject 
matter. Comprehension, understanding, and inferential reasoning are 
essential. The vocabulary should be in context. Standards should be set no lower 
than the equivalent of eleventh grade. 

The writing component should have both an essay and a multiple- 
choice section. The essay should be expository and require the student to 
demonstrate reasoning and organizational skills (for example, take a posi- 
tion and defend it with examples) as well as mastery of the mechanics of 
English (grammar, syntax, punctuation, spelling, and capitalization). The 
multiple-choice section should assess the student's understanding of English 
in context, not merely the student's ability to identify the mechanics of 
English in isolation. Standards should be set no lower than the equivalent 
of eleventh grade. 

Arithmetic (computation) and elementary algebra are essential in the 
mathematics component. Higher levels may be appropriate. The arithmetic 
questions should involve both problem solving and word problems and 
make use of fractions, decimals, and percentages. Estimation problems are 
essential for measuring the understanding of concepts. The algebra items 
should consist both of problems and of word problems and at the minimum 
include linear equations involving numeral, fractional, and literal 
components. Assessment of vocabulary is not important. 

A good placement test is criterion referenced. That is, levels of 
difficulty and proficiency should be established by faculty judgments of what 
students should know, not by norm-referenced procedures based on the skills 
that students bring at entry. 

A good placement test has discriminatory power. That is, it can 
differentiate accurately among students along a continuum of proficiency. 
Discrimination is essential in decisions about the need for remedial or 
developmental education and within levels of basic skills courses. A placement 
test should discriminate best among students with low proficiencies. 

A good placement test is a power test. Speed should not be an 
important factor. Time limits are appropriate only for administrative purposes. 
The rule of thumb is that 100 percent of the students should complete at 
least 75 percent of the items, and 90 percent of the students should attempt 
all the items. 
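
The rule of thumb above can be checked against pilot data with a few lines 
of code. The following is only a minimal sketch, not a procedure given in 
this chapter: the function name, the input format (one count of items 
reached per student), and the treatment of "reached" as a stand-in for both 
"completed" and "attempted" are all assumptions made for illustration. 

```python
from typing import Sequence


def speededness_report(items_reached: Sequence[int], n_items: int) -> dict:
    """Summarize how far students got on a test of n_items questions.

    items_reached[i] is the number of items student i reached
    (a hypothetical input format chosen for this sketch).
    """
    if n_items <= 0 or len(items_reached) == 0:
        raise ValueError("need a positive item count and at least one student")

    n_students = len(items_reached)
    # Students who got through at least 75 percent of the items.
    # Integer comparisons avoid floating-point rounding at the boundary.
    reached_75 = sum(1 for r in items_reached if 4 * r >= 3 * n_items)
    # Students who reached every item on the test.
    reached_all = sum(1 for r in items_reached if r >= n_items)

    return {
        "share_reaching_75_percent": reached_75 / n_students,
        "share_reaching_all": reached_all / n_students,
        # Rule of thumb from the text: everyone gets through 75 percent of
        # the items, and at least 90 percent of students reach all of them.
        "unspeeded": reached_75 == n_students
        and 10 * reached_all >= 9 * n_students,
    }


# Ten hypothetical students on a forty-item test: nine finish, one reaches 30.
print(speededness_report([40] * 9 + [30], 40))
```

A test that fails this check under its current time limit is rewarding 
quickness as well as proficiency, which the text argues a placement test 
should not do. 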

The reliability of a test can be defined as the likelihood that a student 
will achieve the same score if the student takes the test again. (The 
assumption is that the student receives no treatment between administrations.) 
Test-retest and split-half reliability are the methods most often used. Reliability 
coefficients should be at least .90. (Kuder-Richardson 20 coefficients are 
inflated by the length of the test and speededness.) 
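
As an illustration of the two coefficients named above, the sketch below 
computes a split-half estimate and a Kuder-Richardson 20 coefficient from a 
matrix of right/wrong item scores. It rests on assumptions not stated in 
this chapter: the halves are formed from odd- and even-numbered items, the 
half-test correlation is stepped up with the Spearman-Brown formula, and 
population variances are used throughout. The function names and the tiny 
data set are hypothetical. 

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary to compute a correlation")
    return cov / (var_x * var_y) ** 0.5


def split_half_reliability(responses):
    """Odd/even split-half estimate, stepped up with Spearman-Brown.

    responses: one list of 0/1 item scores per student.
    """
    odd_half = [sum(r[0::2]) for r in responses]
    even_half = [sum(r[1::2]) for r in responses]
    r_half = pearson(odd_half, even_half)
    return 2 * r_half / (1 + r_half)


def kr20(responses):
    """Kuder-Richardson 20 for dichotomously scored items."""
    n_students = len(responses)
    n_items = len(responses[0])
    total_variance = pvariance([sum(r) for r in responses])
    if total_variance == 0:
        raise ValueError("total scores must vary")
    # Sum of p * (1 - p), where p is the share answering each item correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(r[j] for r in responses) / n_students
        pq_sum += p * (1 - p)
    return n_items / (n_items - 1) * (1 - pq_sum / total_variance)


# Three hypothetical students, four items each.
scores = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
print(split_half_reliability(scores), kr20(scores))
```

Real pilot samples are far larger than this toy matrix; the point is only 
to show what is being compared with the .90 standard. 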

The validity of a test can be defined as the likelihood that the test in 
fact measures what it is supposed to measure. The validity of a test includes 
its face validity (the degree to which the test looks as if it measures what it is 
supposed to measure), concurrent validity (the test's relationship to other 
similar tests), and predictive validity (the degree to which the test predicts or 
correlates with some criterion, such as course grades). The predictive validity 
of placement tests is difficult to judge, because correlations between 
placement test scores and grades in a remedial or developmental course that 
functions well should approach zero. 

Guessing, an error factor in multiple-choice tests and in most other 
kinds of tests, imperils the accuracy of placement decisions. Because guessing 
can only inflate scores, some tests compensate for it by including a factor 
that systematically lowers scores. The effects of random guessing can be 
limited by increasing the number of choices (four or five are considerably 
better than two) and by directing students accordingly. 
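
The chapter does not say which correction factor it has in mind. One widely 
used convention, shown here only as an assumed example, is formula scoring: 
the number of wrong answers divided by the number of choices minus one is 
subtracted from the number right. The sketch also shows why more choices 
help, since the score expected from blind guessing falls as choices are 
added. Function names and figures are hypothetical. 

```python
def formula_score(n_right: int, n_wrong: int, n_choices: int) -> float:
    """Conventional correction for guessing: R - W / (k - 1).

    Omitted items are neither rewarded nor penalized.
    """
    if n_choices < 2:
        raise ValueError("a multiple-choice item needs at least two choices")
    return n_right - n_wrong / (n_choices - 1)


def expected_chance_score(n_items: int, n_choices: int) -> float:
    """Expected number right if every item is answered by blind guessing."""
    if n_choices < 2:
        raise ValueError("a multiple-choice item needs at least two choices")
    return n_items / n_choices


# A hypothetical student: 30 right, 8 wrong, 2 omitted on five-choice items.
print(formula_score(30, 8, 5))
# Blind guessing on forty items with two choices versus five choices.
print(expected_chance_score(40, 2), expected_chance_score(40, 5))
```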

Every placement test should have an equivalent alternate form that 
can be used both for retesting when necessary and for posttesting. 

Cost is the last variable that needs to be considered. The cost of a test 
includes the cost of materials, administration, and scoring. Placement tests 
should be able to be scored both by machine and by hand. 

Using Tests in Placement Decisions 

The term cut score refers to a test score that is used to differentiate 
student performance for the purpose of making placement decisions. Since 
multiple levels of developmental education should be employed at most 
colleges, multiple levels of cut scores should also be determined. In fact, 
since no one score is sufficient for making decisions, it would be more 
accurate to speak instead of cut ranges. 

The traditional method of establishing cut scores is to correlate test 
scores with grades. This method necessitates placing virtually all students 
in college-level courses at least initially in order to collect the data needed 
for the statistical analysis. Of course, this is probably not appropriate, since 
many of the students who need developmental courses would (or should) 
perform poorly if placed directly in college-level courses. The price of high 
failure rates to establish a statistically based system of cut scores is 
unacceptable to most people. 

The following steps offer a practical method of setting placement 
cutoff ranges that are methodologically sound and that do not increase the 
probability of student failure: First, select a task force or committee of faculty 
and appropriate administrators. Make judgments about the test scores on 
the placement test that are needed for a determination of proficiency. (Past 
cut scores or national norms can be used at first until more information is 
collected.) Second, assume three levels of proficiency for each skills area: the 
level of those who clearly do not need remediation, the level of those who 
clearly need remediation, and the level of those in the large grey area 
between these two extremes. It is in this middle area that other factors 
beyond the placement test scores become increasingly important. Third, in 
systems where levels of remediation exist, establish similar cut score ranges 
for each level offered. Fourth, use this system of cut score ranges to place 
students in developmental courses. Fifth, after two to four weeks, collect 
ratings from course instructors about the success of the placement decisions. 
Ensure that faculty members have rated students on proficiency and not on 
other areas, such as class attendance, participation, or attitude. Instructors 
should make these ratings without knowing the students' placement test 
scores. Sixth, use the information provided by the faculty ratings to adjust 
the cut scores. Change student placements where appropriate and feasible, 
but be conservative. 
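
The second step, sorting scores into three zones around a cut range, can be 
sketched as follows. This is a minimal illustration rather than the method 
of any particular college: the class name, the zone labels, and the score 
values of 155 and 165 are hypothetical, and the faculty committee described 
above would supply the real bounds. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CutRange:
    """A cut range for one skills area (bounds are hypothetical)."""

    lower: int  # scores below this clearly indicate a need for remediation
    upper: int  # scores at or above this clearly indicate no such need

    def __post_init__(self):
        if self.lower > self.upper:
            raise ValueError("lower bound must not exceed upper bound")


def classify(score: int, cut: CutRange) -> str:
    """Place a score in one of the three zones described in the text."""
    if score < cut.lower:
        return "needs remediation"
    if score >= cut.upper:
        return "no remediation needed"
    # Scores inside the range are the grey area, where factors beyond the
    # placement test score should carry more weight.
    return "grey area"


writing_cut = CutRange(lower=155, upper=165)
for s in (150, 160, 170):
    print(s, classify(s, writing_cut))
```

Where several levels of remediation exist, one such range would be defined 
for each level, and the bounds would be revised after instructors' ratings 
are collected. 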

The importance of establishing grey areas cannot be overstated. Tests 
are not perfect, and single scores on one test are considerably less than 
perfect. Accurate and reliable placement decisions can be made only if multiple 
factors are used. At the minimum, seven factors should be considered: 
placement test scores, other available test information, high school data, other 
background data, age, student opinion, and results of additional testing. 

Both placement test scores and the consistency of placement test scores 
should be considered. Scores that fall well above or well below the cut range 
have a relatively high probability of being accurate and should weigh more 
heavily than scores that fall in the grey middle area. Similarly, consistent 
scores (for example, a low essay score combined with a low multiple-choice 
writing score) are probably more accurate than conflicting scores. 

The other available test information that should figure in placement 
decisions can include SAT or ACT scores and scores from any other tests, 
including in-class tests and diagnostic tests, that have been administered. 
Decision makers should look for consistent patterns in the student's test 
scores. 

Information about the school attended, number and kinds of courses 
taken, and high school rank can be helpful. However, there is little 
consistency in the data obtained from different schools and even from different 
courses within the same school. 

The other background data that should be considered include such 
factors as years since high school, jobs and work activities, financial 
situation, and extracurricular activities. As a general rule, the more 
responsibilities and difficulties a student faces in his or her personal life, the greater the 
likelihood that the student will require developmental education, a relatively 
light course load, or both. 

Age is a relevant factor in placement decisions in the following way: 
Older students tend to be more fearful, more cautious, and more motivated. 
Thus, everything else being equal, older students probably have a better 
chance of success in college courses than younger students. 

Student opinion becomes a relevant factor in placement decisions 
only when other factors are confusing, contradictory, or inconclusive. Many 
students, especially recent high school graduates, tend to overestimate their 
abilities. 

Additional testing can help to clarify conflicting information from 
other sources. Retest results should only be used in the context supplied by 
the other data. Diagnostic testing should be used only to identify specific 
skills areas, not to reverse placement decisions. 

Pros and Cons of Statewide Placement Testing 

A growing number of states either have initiated (for example, New 
Jersey, Tennessee, Florida) or are now considering (for example, Texas, 
Georgia, California) mandatory basic skills placement testing for all students 
entering public college systems. What are the advantages and 
disadvantages of a statewide effort in this area? 

The Southern Regional Education Board (SREB, 1986) surveyed the 
placement tests and cut scores used by colleges in its fifteen-state region. It 
found that more than a hundred different tests were used and that the cut 
scores ranged from a low of the first percentile to a high of the 
ninety-fourth percentile. How can standards be comparable in the face of such 
divergence? 

It could be argued that such differences exemplify the variety of the 
missions of the American higher education system. But does this rationale 
for diversity hold when we attempt to define the basic skills of the students 
who enter college? Should there be a floor, a minimum standard in basic 
skills proficiency that every college should demand for its college-level 
courses? While the answer to this question does not necessarily lead to a 
statewide measure, a statewide test would make it necessary to reach some 
agreement both about what should be measured and at what level. The 
establishment of a state standard or at least of a floor leads to an 
understanding of the meaning of proficiency, to the setting of a minimum 
standard. Of course, the fact that institutions have different missions can and 
should allow for the establishment of cut scores higher than the minimum. 

There is an additional concern about basing standards only on a 
local or individual institution that can be described as the norm-referenced 
phenomenon, namely the tendency to set standards according to the 
proficiencies of the students who come to the institution. This tendency can 
jeopardize both quality and standards when a college sets its cut scores at a 
predetermined level based on some a priori percentage of the number of 
students who should or can be accommodated in developmental or 
remedial courses (for example, one quarter or one third). The use of a statewide 
standard helps faculty to select criteria according to what proficiency in 
basic skills is judged to be, regardless of the college in which a student 
enrolls or of the proficiencies of entering students at that school. This allows 
the program to be adjusted according to the needs of the students, not of the 
standards. 

Feedback to the high schools is the third important reason for 
establishing a statewide testing program. Only if there is a standardized statewide 
examination for all entering freshmen can meaningful information on the 
proficiencies of graduating students be sent to the high schools of the state. 
It is impossible to interpret the results of differing tests that use differing 
levels of proficiency and content in any meaningful way. It is unlikely that 
anything can be more powerful in this regard than the results of a statewide 
test of basic skills proficiency. 

Decreases in costs, increases in communication (within colleges, 
across different colleges and sections of higher education, and between K-12 
and postsecondary education), and data for reform are all important 
variables that support the need for statewide testing. While statewide basic skills 
testing is not necessary for effective course placement, it provides a powerful 
mechanism for establishing educational standards as well as a strong catalyst 
for reform. 

Conclusion 

Placement testing is an essential ingredient of a successful college 
program. The diversity of background and proficiency that students bring 
to our colleges demands individual attention and course selection. To dump 
everyone in the same level of course is significantly to increase the 
probability either of lowering standards or of failing many students. The test that is 
selected and the cut scores that are used play important roles in access, 
retention, and quality. Colleges need to place as much emphasis on the careful 
selection of a placement test as they do on curriculum development and 
student recruitment. Any college that does not recognize the interaction will 
pay a high price, and so will its students. 



Accommodating testing situations to disabled students presents 
special challenges for the administration and interpretation 
of test results. This chapter provides some background 
information on the testing of disabled students and presents 
results from a recent survey of efforts in California to deal 
with this issue. 



Accommodating Testing 
to Disabled Students 

Emmett Casey 



The community colleges face a critical juncture during the 1980s. The 
preceding two decades were periods of tremendous growth and expansion 
for postsecondary education. However, higher education is now 
experiencing enrollment declines, budget restrictions, and competition for 
students. In an effort to maintain open access, community colleges accept all 
the students they can. Recent studies indicate that persons with disabilities 
of college age are attending postsecondary institutions in increasing 
numbers (Black, 1982). While continuing to make college attractive and 
accessible, community colleges also want to provide the opportunity for 
success. To accomplish these goals of access and success, more assessment 
of potential students, including students with disabilities, is taking place. 

Community colleges are using considerably more testing for 
admissions, placement, and related academic activities than they did in the past 
(Woods, 1985). The administration of such tests has an impact on all 
students, but it may have a significant impact on students with disabilities. 
Because much of the testing is new, few data are available on what tests 
are being given and on whether and how testing is being accommodated 
to the needs of students with disabilities. 

Section 504 of the 1973 Rehabilitation Act requires that testing be 
adapted for disabled students so that it measures what it is designed to 
measure while allowing for the student's disability. The prevailing 
philosophy among the people who work with disabled students and among 
disabled students themselves is that academic standards must be 
maintained while appropriate accommodations in test administration are made. 
The attitudes among faculty, administrators, and students as well as the 
general public can range from the position that disabled students should 
have to take tests under the same conditions as other students to 
demonstrate that they belong in school to the position that disabled students 
should not have to take tests at all. It seems likely that there is a valid 
middle ground somewhere between these extremes. 

The literature that followed passage of Section 504 of the 1973 Reha- 
bilitation Act focused on ensuring the rights of disabled students and 
reinforced the need for testing accommodation (Federal Register, 1980). 
Yet, the literature has little to say about how postsecondary education can 
accommodate disabled students in the area of testing. An Educational 
Resources Information Center (ERIC) search using the descriptors disabil- 
ities, postsecondary education, and admissions turned up five articles. The 
descriptors disabilities, postsecondary education, and student recruitment 
yielded sixteen articles, and the descriptors disabilities, postsecondary edu- 
cation, and college entrance examinations yielded only one. 

The Office of Civil Rights published a guide for activities that 
would assist in complying with Section 504. The section relating to 
admission tests states: "Some of the questions and issues raised by testing have 
not been resolved in a manner that will allow useful guidelines at this 
time" (Redden, Levering, and DiQuinzio, 1978, p. 21). In 1981, the 
Association of Handicapped Student Service Programs in Postsecondary 
Education (AHSSPPE) sponsored a conference on the accessible institution of 
higher education. Questions regarding the validation of alternative tests, 
concerns about the identification and accommodation of learning 
disabilities, and issues of standardized tests were addressed. It was noted that 
there are no "fully developed test modifications suitable for all 
handicapped individuals, nor is there information about the comparability of 
available tests for the handicapped" (Sherman, 1981, p. 68). 

The lack of information and knowledge extends from the profes- 
sionals in the field to disabled persons as well. Ragosta (1981) examined 
how disabled students perceived the SAT with its modifications. Her find- 
ings revealed that few disabled students were even aware of the possibility 
of special administrations of standardized tests. 

Test Validity and Accommodation 

Testing the handicapped leads to a quandary from which there are 
few avenues of escape. Most of the tests used for admission to college have 
norms and standardized procedures. When special accommodations based 
on disability alter the standardized procedure, the validity of the test may 
be called into question. However, if the standardized procedure is followed, 
the learning potential or achievement of the disabled person may be 
underestimated. 

In some instances, tests may be waived for disabled students because 
of this problem. For example, a law passed in Massachusetts in 1983 freed 
high school students with dyslexia and other language learning disabilities 
from having to take aptitude tests in order to gain admission to state 
colleges and universities. In instances where accommodations are made for 
the disabled students, the results are "red flagged" to indicate that 
procedures other than the standardized ones were used, for example, that the 
time allowed for completing the test was extended. This practice could 
tend to draw attention to the disabled student, and it may be 
discriminatory. It also makes the results difficult to interpret. 

The solution is not much clearer if testing is to be continued. One 
possible way of resolving the quandary is to use the same tests but to 
adapt the procedures in a standardized fashion. Separate norms for the 
disabled would then be used to interpret test scores. The alternative is 
totally separate tests based on disability. 

The type of disability would dictate the possible accommodation. 
Students who are legally blind or who have serious vision problems may 
require taped tests, large-print tests, tests in braille, or persons to read the 
tests and record the students' responses. These students may require a 
special setting or equipment so that the testing mode would not distract 
other students taking tests. However, problems arise if part of the exam 
requires students to interpret printed charts and graphs, which are difficult 
to describe verbally. Mathematics may also be difficult to accommodate in 
this mode. 

Deaf students may require test instructions to be given in sign 
language but be expected to read the exam and answer the questions. In such 
a situation, a deaf student with an English language deficiency might 
score lower than he or she would if the test had been administered com- 
pletely in sign language. Deaf students may do much better in the mathe- 
matics component if the problems are not word problems but 
computations and calculations. 

Two large national testing services, Educational Testing Service 
(ETS) and the American College Testing Program (ACT), are interested in 
the issue of testing disabled students. Studies of admissions testing and 
disabled individuals have been undertaken by the College Board, the 
Graduate Record Examinations Board, and ETS, and two reports have resulted 
(Bennett and Ragosta, 1985; Bennett, Ragosta, and Stricker, 1985). The 
authors found considerable disagreement in the field of special education 
about the definitions of particular disabilities, especially about learning 
disabilities. This disagreement causes serious problems for researchers. 

In addition, few disabled are administered standardized admissions 
tests, such as the Scholastic Aptitude Test (SAT). In the 1982-83 school 
year, 4.2 million students (approximately 10 percent of the entire public 
school population) were classified as handicapped by the nation's 
elementary and secondary schools. Yet, only approximately 6,000 of the 1.5 
million students who took the SAT requested special administration. The 
overwhelming majority (4,300) of those who requested special 
administration were learning disabled. Why were the handicapped so 
underrepresented? Is it a problem with definition, or is it merely lack of knowledge 
that special administrations are available? Perhaps few handicapped are 
considering further education, or perhaps they are admitted to colleges 
that waive test requirements. Further research is needed. 
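
The size of the gap can be made concrete with the figures quoted above. The 
short calculation below is only a rough illustration: it uses the rounded 
numbers from the text, and it compares two different populations (all 
public school students versus SAT takers), so the two percentages are not 
strictly comparable. Variable names are chosen for this sketch. 

```python
# Rounded figures quoted in the text for the 1982-83 school year.
sat_takers = 1_500_000
special_requests = 6_000
learning_disabled_requests = 4_300
share_classified_handicapped = 0.10  # approximate share of public school students

# Share of SAT takers who requested a special administration.
share_special = special_requests / sat_takers
# Share of those requests that came from learning-disabled students.
share_learning_disabled = learning_disabled_requests / special_requests

print(f"Special administrations: {share_special:.1%} of SAT takers")
print(f"Learning disabled: {share_learning_disabled:.1%} of special requests")
print(f"Classified as handicapped in schools: {share_classified_handicapped:.0%}")
```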

However, despite the definitional problems and the small numbers, 
the admissions testing surveyed by Bennett and Ragosta (1985) and 
Bennett, Ragosta, and Stricker (1985) indicates that students with physical or 
visual disabilities performed similarly to, but at a level slightly lower than, 
nondisabled peers. Learning-disabled students performed at levels 
significantly below those of nondisabled peers. Students with hearing disabilities 
performed least well as a group, and they performed better on mathemat- 
ical measures than they did on verbal ones. Last, students who performed 
poorly on admissions tests did poorly in college, and students who per- 
formed well on admissions tests did well in college, whether they were 
disabled or not. 

California Community Colleges Survey of Testing 
Accommodation for Disabled Students 

California has one of the largest configurations of community col- 
leges in the world, with approximately 1.5 million students. With this 
number, there are approximately 50,000 disabled students or almost 3.5 
percent of the student population. California is also one of the leaders, if 
not the leader, in providing special funding for programs for disabled 
students at the postsecondary level. For these reasons, it seemed 
appropriate to survey what the community colleges in California were doing with 
respect to testing and accommodation for students with disabilities. 

Purpose and Scope of Survey. A study was conducted in order to 
answer the following questions: Are testing accommodations being made 
for disabled students? What accommodations are currently being made 
and for whom? What other accommodations might be made and for 
whom? Are disabled students waived from taking tests, and if so, which 
students? Last, what types of tests are being used for placement? 

Procedure. Figure 1 shows the survey form that was developed to 
elicit answers to the questions just stated. (Figure 1 also tabulates the survey 
results.) It was based on a form developed by the New York University 
Office for Education of Children with Handicapping Conditions in March 
1982. The form was field-tested by colleagues in the community colleges, 
and their input was used to clarify and refine it further. The form was 
then sent by the Office of Specially Funded Programs of the Community 
Colleges State Chancellor to all 106 community colleges in the state. The 
survey was addressed to deans of students, since it was felt that most college 
testing programs would fall under their jurisdiction. Recipients were 
instructed to return the completed form to San Diego for processing. 

Figure 1. Survey Form for the California Community Colleges Survey 
of Testing Accommodation for Disabled Students 

Please answer the following questions regarding testing and disabled students on your 
campus. 

1. Does your college currently have testing for class placement? 

   97% Yes    3% No 

2. If yes, does your college have special accommodations for disabled students? 

   98% Yes    ____ No 

3. If your college does not currently make accommodations for testing disabled students, 
what accommodations do you think they might make in the future for testing? 

4. Are accommodations made for classroom exams, such as quizzes, lab exams, oral 
presentations? 

   ____ Yes    ____ No 

5. If yes, please indicate what types of accommodations are made. Mark all that apply. 

   ____ Time limit extended 
   ____ Exam administered in a special location 
   ____ Answers recorded in any manner, e.g. typewriter, computer, or tape recorder 
   ____ Use of calculator 
   ____ Questions read or interpreted (sign language) 
   ____ Exam provided in braille, large print, or on tape 
   ____ Questions omitted, credit prorated 
   ____ Other (please specify) 

6. Are disabled students waived from taking tests? 

   14% Yes    86% No 

7. If yes, please mark the types of disabled students for whom waivers are granted. Mark all 
that apply. 

   ____ Deaf 
   ____ Blind 
   ____ Physically disabled 
   ____ Specific learning disabled 
   ____ Developmentally disabled 
   ____ Other (please specify) 

8. What types of placement testing do you currently use? 

   ____ New Jersey Test of Basic Skills (NJTBS) 
   ____ ASSET 
   ____ Other (please specify) 

9. In your opinion, on a scale of 1 to 5, how important is placement testing? Please mark 
below. 

   Very Important                  Not Important 
        5       4       3       2       1 
       20       9       5       1       1 

Comments. 

One hundred and one of the 106 colleges (95 percent) completed 
the survey. One college returned two copies of the form, one completed by 
the dean and one by the head of the disabled students program. Their
responses were different, and both copies of the form were included in the 
analysis. 
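As a quick check on the figures above, the response rate can be recomputed from the two counts given in the text. This is only an illustrative sketch; the variable names are chosen here and do not come from the survey itself.

```python
# Response-rate arithmetic for the survey described above.
colleges_surveyed = 106    # forms sent out (from the text)
colleges_responding = 101  # colleges that completed the survey (from the text)

# One college returned two differing forms, and both were analyzed.
forms_analyzed = colleges_responding + 1

response_rate = 100 * colleges_responding / colleges_surveyed
print(f"Response rate: {response_rate:.1f} percent")
print(f"Forms included in the analysis: {forms_analyzed}")
```

Because two forms from one college were retained, percentages reported later may rest on the number of forms rather than the number of colleges; the text does not say which base was used.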

Results. Community colleges in California give placement tests to 
their students and provide special accommodations for disabled students. 
Almost all the colleges (97 percent) reported that they were testing for 
class placement, and of these colleges, 98 percent said they had special 
accommodations for disabled students. 

Table 1 shows how the accommodations made in placement testing 
vary by disability. For visual impairment, most respondents extend time
limits (85 percent) or administer the exam in a special location
(89 percent). Surprisingly, only about two thirds stated that they
accommodated visual impairments by reading questions or by providing a
copy of the exam in braille or large print or a copy recorded on tape.
Fewer accommodations are made for those who are physically impaired with motor
difficulties, although a large percentage receive extended time and special 
locations. Students with specific learning disabilities and hearing impair- 
ments are often accommodated by extending time limits and providing a 
special location, but the incidence of accommodation for these students is 
lower than it is for both visual and physical impairment. 

When the responses of those who said they were willing to make 
accommodations in the future are added to the category of accommoda- 
tions currently being made, we can see a trend toward unanimous 
approval for having colleges accommodate students with disabilities at 
least in some fashion. Administrators are most likely to provide extra time
and appear least likely to allow the use of a calculator, either currently or 
in the future. Greater leeway in allowing this device might have been
expected, especially for the learning-disabled students.
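The addition described in this paragraph can be sketched in a few lines. The figures below are transcribed from Table 1 as (currently done, possible in future) percentages; the short accommodation labels, the function names, and the unweighted average across disability categories are assumptions made for illustration, since the text does not say how the authors summarized across categories.

```python
# Percentages transcribed from Table 1: (currently done, possible in future).
ACCOMMODATIONS = ["time", "location", "recorded", "calculator", "read", "braille"]

TABLE_1 = {
    "Visual impairment":             [(85, 1), (89, 4), (45, 19), (29, 11), (66, 6), (64, 22)],
    "Physical impairment":           [(82, 9), (80, 6), (44, 18), (24, 8), (25, 4), (12, 1)],
    "Health impairment":             [(60, 8), (60, 10), (26, 13), (18, 10), (18, 5), (12, 1)],
    "Specific learning disabilities": [(73, 8), (74, 6), (38, 20), (28, 12), (53, 5), (28, 13)],
    "Hearing impaired":              [(69, 8), (65, 9), (13, 11), (14, 8), (64, 13), (9, 5)],
    "Speech impairment":             [(35, 7), (35, 10), (14, 12), (9, 7), (11, 7), (7, 6)],
}


def combined(table):
    """Add 'currently done' and 'possible in future' for every cell."""
    return {name: [now + later for now, later in cells] for name, cells in table.items()}


def mean_by_accommodation(totals):
    """Unweighted mean of the combined percentages across disability categories."""
    count = len(totals)
    return {
        label: sum(row[i] for row in totals.values()) / count
        for i, label in enumerate(ACCOMMODATIONS)
    }


totals = combined(TABLE_1)
means = mean_by_accommodation(totals)
least_offered = min(means, key=means.get)

for label, value in means.items():
    print(f"{label:<10} {value:5.1f}")
print("Least offered, now or in future:", least_offered)
```

On this rough measure the calculator comes out lowest, in line with the observation above, though only narrowly; a weighted summary could rank the accommodations differently.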

The placement tests used at the colleges where these accommodations
are being made are typically the College Board Comparative Guidance and
Placement Program (CGP) and the American College Testing Program's
ASSET for reading and writing. About 50 percent of the respondents used
one of these measures in reading, and 47 percent did so for writing. In
math, 25 percent reported using one of these two tests, while another
21 percent used a locally developed test.

Table 1. Alternative Testing Techniques Used for Disabled Students

Accommodations:
  A = Time limit extended
  B = Exam administered in a special location
  C = Answers recorded on tape recorder, dictaphone, typewriter
  D = Use of a calculator allowed
  E = Questions read or interpreted by sign language
  F = Exam copy provided in braille or large print or on tape

Each cell gives percent currently done / percent possible in future.

Student Disability/
Learner Characteristics            A       B       C       D       E       F

Visual impairment                85/1    89/4    45/19   29/11   66/6    64/22
Physical impairment with
  motor difficulties             82/9    80/6    44/18   24/8    25/4    12/1
Health impairment                60/8    60/10   26/13   18/10   18/5    12/1
Specific learning disabilities   73/8    74/6    38/20   28/12   53/5    28/13
Hearing impaired with
  language difficulties          69/8    65/9    13/11   14/8    64/13    9/5
Speech impairment                35/7    35/10   14/12    9/7    11/7     7/6

To the question of how important placement testing was, approximately
one third of the respondents thought that it was of some importance. A
very small percentage (2 percent) considered it to be of no importance.
The majority of the respondents did not answer the question.

Various accommodations are also being used in the classroom to
test disabled students. In response to the question, Are accommodations
made for classroom exams, for example, quizzes, lab exams, oral
presentations? 98 percent said yes. To the question, Are disabled students
waived from taking tests? 85 percent said no. It seems to make sense that
waivers are not necessary if accommodation is being made. Only a very
small percentage of the respondents indicated that waivers were granted
for any type of disability.

In the classroom, the most frequent accommodation (94 percent) 
was to extend time limits and administer the exam in a special location. 
Reading questions to students or interpreting them in sign language 
occurred more often in the classroom than it did in the standardized testing 
situation. In rank order based on the percentage of responses, the other 
accommodations that were reported were answers recorded in any manner 
(83 percent), exam provided in braille or large print or on tape
(75 percent), use of calculator allowed (46 percent), other (22 percent),
and questions omitted with credit prorated (10 percent).

What are the implications of the willingness of colleges to accom- 
modate students with disabilities? It appears that the twin goals of access 
and success alluded to earlier for community colleges in California are 
being realized through the effort to accommodate students with 
disabilities. 

Summary and Recommendations 

Testing the growing population of disabled students is a difficult
issue. Solutions that are suitable in all cases have yet to be found. In the
meantime, disabled students are often tested under a variety of
accommodations. However, the results lack precise meaning whenever
comparisons are made and predictions are needed. Nevertheless, the
following recommendations can be made for the testing and accommodation
to be provided for students with disabilities in the future: First,
indicators other than actual testing (for example, letters from previous
teachers indicating skill levels and types of accommodation needed for
successful completion of courses) should be accepted for placement
decisions. Second, "standardized" methods for the administration of tests
should be developed for each disability category. This recommendation
might mean administering tests to the blind via tape recording in a
special location or substituting an art history class for a visual arts
type of class if such a class is required for graduation or a diploma. The
test would not include the use of graphs or charts. Third, rather than
basing placement decisions exclusively on test scores, colleges should
allow disabled students to try a class at what is agreed to be the most
likely level of placement. If that level is subsequently shown to be
inappropriate, the necessary adjustments can still be made. Fourth,
practice tests should be provided to give students with disabilities an
opportunity to improve their performance. Fifth, collaboration between
K-12 schools and colleges or continuing education facilities should become
closer to help disabled students make the transition. Sixth, the use of
advisory groups of disabled persons to review modifications of procedures,
accommodations, or newly developed tests should increase. Seventh,
disabled students should become more involved in planning by local state
departments of rehabilitation. This recommendation may also help with
the problem of identifying learning-disabled students and determining
eligibility for learning disabilities services. Last, programs of public
awareness should be increased so that disabled students as well as the
general public know what accommodations are available.



The state of Florida uses several forms of assessment to
improve the quality of public higher education.



The Impact of Assessment 
on Minority Access 

Roy E. McTarnaghan 



Assessment in Florida's postsecondary institutions focuses on taking stock 
of student achievement at periodic intervals, improving guidance and place- 
ment for appropriate course experiences, improving feedback to secondary 
schools on college-level performance so that strengths and weaknesses can 
be noted, increasing college readiness for applicants from secondary
schools, improving the likelihood of retention and success in college, and
measuring college-level skills at the end of the second college year. A
variety of intervention strategies have been developed, some by way of
legislative initiative; others were identified in the master plans of the
three public higher education boards: the Postsecondary Education Planning
Commission, the Board of Regents, and the State Board for Community
Colleges. All groups are committed to quality control and quality
improvement.

Now, nearly ten years after this series of actions started, evidence is
beginning to mount that setting reasonable goals, communicating them
effectively, and giving faculty the responsibility for developing standards
and assessment techniques have made a positive contribution to quality
control in higher education. At the same time, a high level of sensitivity
to the potential for negative impact on minority access has challenged the
state to improve its record in this area.

A formal series of assessment measures is in place in Florida, both
in the public schools and at the college and university level. These
measures range from requiring elementary and secondary school basic skills
tests and minimum achievement levels in reading, writing, mathematics,
and application of skills to daily life to tightening graduation
requirements, using placement exams, making grade information from college
available to secondary schools, and measuring achievement at the end of
the lower-division core courses in college. These changes did not all occur
together, nor were they even linked in the original plan. Rather, they
evolved out of a concern to improve education, to regain the public trust,
and to recover what had been lost: the idea that a diploma or degree
represented achievement and mastery, not just time spent in classrooms. The
discovery that minority students were less likely to be in a college
preparatory curriculum, more likely to be counseled into vocational
programs, and more likely to be ill-prepared and thus to fail in college
degree programs was another part of this evolution. The open door looked to
many minority students like a swinging door, quick in and quick out.
Florida's assessment programs have been designed to be useful, helpful, and
supportive of the educational process. The mandated programs have been
designed to specify objectives, see that students know what is expected, use
assessment to evaluate readiness, provide periodic feedback, and certify
achievement at specified levels. Questions will always be raised about the
level of achievement or performance that is selected, but procedures are in
place to monitor and to recommend changes as needed.

In order to support improvement in educational programs and student
achievement and to assure that assessment is used constructively to
increase minority access, states need to build a data base that enables them
to observe how assessment is being used, how changes are made, and what
data are available for applications, admissions, enrollment, attrition,
retention, and degrees earned. A feedback loop is necessary to evaluate
present plans and to adjust them in order to build on areas of success and
eliminate problem areas. It must be clear that the improvement of minority
access is an integral part of any assessment program. The Florida
legislature has funded a number of assessment programs, and it and the state
board of education, together with the State Board for Community Colleges
and the Board of Regents, require regular reporting.

Historical Development 

Florida's public system of higher education has been characterized
since 1965 by a formal transfer arrangement between two-year community
colleges and four-year universities. The community colleges have been
primarily open access, while access to the universities has been limited
both by admission standards and by a pre-established enrollment plan. In
this environment, of every hundred students enrolled over the last ten
years as entering freshmen in public higher education, seventy-eight have
entered a community college, and twenty-two have entered a university.

The formal articulation agreement between the two sectors provides
for transfer to the junior year in the university system for any student who
completes the associate of arts degree at one of Florida's twenty-eight com- 
munity colleges. The core of general education is accepted in this transfer 
as a package, and the individual courses in the degree program are not an 
issue. In the context described here, assessment in the community colleges 
had for many years focused on guidance and placement for the entering 
student, while in the university it was generally thought of as part of the 
admissions process. 

Core academic high school units that were part of graduation 
requirements when the 1965 articulation agreement was signed were 
eroded when the state minimum standards were phased out and replaced 
by local district guidelines. During the 1970s, the number of college pre- 
paratory courses taken by graduating high school seniors dropped signifi- 
cantly, and the public expressed concern over the perceived quality of the 
high school diploma. Without imposing course requirements, the legisla- 
ture began in 1976 to impose assessment tests to measure basic skills 
among those qualifying for graduation. A state-developed test, the Florida 
Twelfth-Grade Test, had been used for many years in combination with 
high school performance to predict the student's college performance for 
entry into the state university system. Allegations of discriminatory use of 
this instrument and charges that the test was racially biased led the Florida 
legislature to stop funding the program.

Admissions to State Universities 

Against this background, validation studies were conducted in the
state university system using secondary school performance and nationally
normed admissions tests. Analysis of entering freshman applicants between
1978 and 1980 showed that fewer than one quarter had completed what
had been considered a college preparatory program some fifteen years
earlier. Further, black students appeared to be placed in non-college
preparatory courses in such large numbers that no more than 10 percent were
in the additional sequence geared for college.

Conventional studies of efforts to predict college success in the
entering year had shown that the core academic courses were generally a
better predictor than an admissions test. Florida studies in the period
around 1980 continued to show that the tendency prevailed for white students
and that it was less predictive for Hispanic and black students. This
analysis suggested that the higher correlation between the admissions test
and achieved grade point average in college could be due in part to the fact
that large numbers of minority students enrolled in non-college preparatory
courses. A review of several thousand high school transcripts in 1980
for admission to the state university system confirmed that minority
students had been exposed on the average to one to two units less in
mathematics and science than majority students had. While the differences in
English and in the social sciences were not great, placement appeared to
be made between and among sections to focus on college-bound and
non-college-bound students; minorities were more numerous among the
non-college-bound groups.

The result was that the Board of Regents of the state university
system endorsed increased admissions standards in 1981. The increased
standards called for higher score levels on the two nationally normed
admissions tests as well as increases in the number and type of college
preparatory courses; the course requirements were to rise in three phases:
1981, 1984, and 1986. The regents also encouraged close counseling and
advisement ties between higher education and public schools so as to
encourage minorities to enroll in courses and programs that would help
them to succeed in college. Florida had secured an agreement with the
United States Office for Civil Rights in 1978 on a plan aimed at increasing
minority participation in postsecondary education, and the two-year and
four-year colleges were linked in the effort. What effect would raising
standards have on the challenge to increase the numbers? An important
provision of the admissions policy for the university system was to provide
for exceptions as needed in order to meet minority enrollment goals. As
the policy was carried out, special support services were developed at the
institutional level to provide reinforcement for less well-prepared students.

College-Level Academic Skills Test

In the early 1980s, the Florida legislature mandated the development
of an assessment program called the College-Level Academic Skills
Test (CLAST). This program, which involved community college and
university faculty in the computation and communication areas, specified
college sophomore-level competencies in computation, reading, writing,
and essay. By 1984, statewide standards were in place as a factor in
qualifying for the associate in arts degree or for moving to the upper
division in a state university. The same standards must be achieved for the
bachelor's degree. The cutoff scores for these standards were increased in
1984 and 1986, and they are to increase again in 1989.

Increasing High School Graduation Requirements 

In 1983, the Florida legislature mandated increased high school
graduation requirements, similar to the university system admission
standards of 1981, for all high school graduates. The requirements were to
become effective in 1987. By that act, the legislature completed a full
circle in the area of mandated graduation requirements since the state's
specified standards had been withdrawn some years earlier.

During the discussion about increasing graduation requirements,
the concern was expressed that this action might reduce minority
enrollment in higher education and cause Florida's already low ranking in
secondary school persistence rates between ninth grade and graduation to
drop even more. To assist in the transition to college, a series of four
instruments was authorized for use in the two-year and four-year
institutions for the purpose of guidance and placement. Minimum cutoff
scores were set. Students admitted who scored below those levels were
required to enroll in a noncredit activity in either communication or
computation. The students enrolled in noncredit work would be funded as part
of the community college mission, not as part of the university mission.
University students so enrolled would normally be instructed by an area
community college, sometimes on the university campus by contract
arrangement.

What Have Been the Results? 



The evidence that accumulated between the 1978-79 and 1984-85
school years shows that the persistence rates from ninth grade through
graduation remained constant at 54 percent for black students and that
they rose from 57 percent to 64 percent for Hispanic students. While
there was an increase in the proportion of blacks who entered
postsecondary education in Florida's public institutions between 1978 and
1980, the numbers have leveled off and in some cases declined. The
proportion of Hispanics who entered postsecondary education has continued to
rise since 1978.

An analysis by Florida Board of Regents staff in 1982 and 1983
showed that the largest cause of the decline in black enrollment in
postsecondary education directly from secondary schools was heavy military
recruiting that offered options for later education benefits. While the
male-female breakout among most racial groups seldom exceeded
54 percent-46 percent, black enrollment in the state university system for
entering students was nearly 65 percent female. During the early 1980s, the
leveling off of black enrollment in most of the southern states occurred in
open-access as well as in selective admissions institutions, both two-year
and four-year. Florida's experience with assessment does not seem to have
reduced access for minorities.

A review of changes in CLAST scores since the first administration
in October 1982 shows that passing rates for blacks increased 38 percentage
points to 72.6 percent, Hispanics increased 37 percentage points to 90.4
percent, and whites increased 13 percentage points to 93.1 percent.

At Florida A. & M. University, the state's traditionally black
institution that still has a large majority of black students, the June 1986
passing rate on all five subtests of the CLAST was 85.5 percent. This figure
can be compared with passing rates of 33 percent in June 1983, 46 percent
in June 1984, and 52.2 percent in June 1985. Early in this process, Florida
A. & M. focused additional resources and support programs at the
lower-division computation and communication levels. The school reports that
this investment is paying off in student achievement.

A review of the increased high school graduation requirements
showed that in 1983, 63 percent of blacks would have met the 1986 English
requirements. By 1985, that proportion had risen to 90 percent. In 1983, 45
percent of blacks would have met the 1986 mathematics requirements. By
1985, that proportion had risen to 87 percent. Similar gains occurred for
Hispanic and white students, although they were not as dramatic.

Retention in College 

If 1979 is used as the base year for first-time-in-college entering
students, the university system is showing improved retention. In the
four-year period that ended in 1983, the two-year rate of retention for the
largest minority population groups was as follows: Black students improved
from 60.2 percent to 73.6 percent, and Hispanic students improved from 70.9
percent to 81.4 percent. Longer-range studies are continuing. It appears
that the opportunity for special counseling services and a more regularized
advisement program may be as effective in this process as the precollege
curriculum experiences.



Engineering: A Target Area 

Engineering had the smallest share of minority enrollment, partic- 
ularly black. As a result of a five-year plan to expand and improve this 
discipline in Florida, a special commitment was made to counsel and 
recruit more minorities. Evidence for the 1978-1980 period showed few 
blacks being counseled into engineering in Florida, either at the high 
school or college level. Precollege experiences in the math and science
areas were often minor.

In fall 1980, 542 blacks were enrolled in engineering programs in
the state university system. By fall 1985, that number had risen to 826, a
gain of 52.4 percent. In fall 1980, 573 Hispanics were enrolled in
engineering programs. By fall 1985, that number had risen to 1,285, a gain of
124.2 percent. These impressive gains were accompanied by a major state
commitment for new facilities, equipment, and faculty and by an overall
enrollment growth that totaled 55.8 percent for the system in engineering.
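The percentage gains quoted above follow from the fall 1980 and fall 1985 headcounts. The short sketch below recomputes them; the function name is illustrative, and the recomputed values agree with the reported ones only to within rounding.

```python
def percent_gain(start, end):
    """Percentage increase from a starting headcount to an ending headcount."""
    return 100 * (end - start) / start


# Headcounts from the text: fall 1980 and fall 1985 engineering enrollment.
black_gain = percent_gain(542, 826)
hispanic_gain = percent_gain(573, 1285)

print(f"Black enrollment gain:    {black_gain:.1f} percent")
print(f"Hispanic enrollment gain: {hispanic_gain:.1f} percent")
```

Both gains can then be set against the 55.8 percent overall growth in engineering enrollment: the gain for Hispanic students was more than double the system rate, while the gain for black students was slightly below it.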
Conclusion

One of the concerns that led to Florida's assessment programs was
loss of credibility in the link between instruction and credentialing. The
analysis thus far of the several components of assessment indicates that
quality control and credibility are being restored. Most of the goals of
Florida's assessment plan appear to be on target in 1986. High school
graduates exit with more college-preparatory course work than they did in
the past, and there have been score gains in the past two years among
those students on both of the nationally normed college admissions tests.
Dramatic gains in college enrollment are occurring for Hispanic students
in postsecondary programs, while black enrollment tends to remain fairly
level. Retention is up in college programs, CLAST scores show
improvement, and target programs, such as engineering, have seen dramatic
gains in minority enrollment.

When assessment is used with discretion and good planning, it can 
be a useful tool to help minorities to succeed in postsecondary education. 
Of course, while Florida can point with pride to some achievement, much 
remains to be done. Exemplary programs that have produced results need 
to be expanded. Changes in policies that have the effect of restricting 
access, such as changes in financial aid policies, and class schedules that 
are inconvenient for part-timers may need to be adjusted. Success will 
come over many years of diligent effort and commitment. 



Roy E. McTarnaghan is vice-chancellor of the State University
System of Florida.



Rapidly changing technology will have a dramatic impact on
assessment of students both for placement and instruction. 
An exciting potential for increased individualization is 
available if we but choose to use it. 



Technology and Testing: 
What Is Around the Corner? 

Jeanine C. Rounds, Martha J. Kanter, 
Marlene Blumin 



We are now on the verge of a technological revolution in 
testing. Paradoxically, the new testing is, in a sense, a return 
to old-fashioned individualized examinations. . . . Now, how- 
ever, the arbitrariness and lack of objectivity of such exams 
will have been removed [Wainer, 1983, p. 16].

Whether this optimistic prediction will become true remains to be seen. 
However, in recent years, as assessment at college has made a major resur- 
gence, schools are looking increasingly toward technology to help with 
the process of administering, scoring, and even interpreting the results of 
assessment activities. As the number of students to be tested has grown
and as the level of the information requested has risen, the computer and
computer-related technology have become essential components of testing
programs. The speed, depth, and breadth of the data that they make
available and their ability to synthesize these data with other
information that may be available have already ushered in a new period of
testing. Along with technological change, advances in the field of
cognitive science, particularly in information processing, offer
possibilities for new and exciting applications to testing. As a result,
testing is being linked to improvement of instruction and to student
retention and learning outcomes as well as to initial placement. As the
technology continues to improve and as our ability to collect and
interpret information increases, we can only hope that the result will in
fact be an emphasis on individual qualities.

Many of the capabilities that once seemed to lie in the distant future
are now available, and others soon will be. For example, we are becoming
remarkably more efficient in data synthesis and analysis. Immediate and
individual feedback is available on many campuses. In addition, the
computer-adaptive test is already in use at a few locations.
Computer-adaptive tests free assessment from the constraints of the timed
test that adversely affect many test takers. Diagnosis of individual
academic skills is now available, as is analysis of physical skills.
Assessment tasks that use simulation or interactive videodiscs are also
coming to the market. Such tests will provide more realistic assessment
tasks in many areas. Regular measurement of learning outcomes will identify
efficient learning modes for individual students and have an impact both on
curriculum and on instructional delivery. Yet another impact in the near
future will be the use of the computer to analyze relatively subjective
areas, such as writing. The opportunities are limitless. The issue of key
interest to educators is the use to which the technology will be put.

Pretest Use of Computers

One major way in which computers are currently being used is for
test preparation. Software is being developed to prepare students for exams 
and even to provide simulated versions of the tests. The test preparation 
software now available includes materials for the high school proficiency 
(G.E.D.) exam, the Scholastic Aptitude Test (SAT), and the American Col- 
lege Testing Program (ACT) exam. Four years ago, Silverman and Dunn 
(1983) reviewed ten programs developed just to prepare students for the 
SAT. In summer 1986, two forms of software to practice the Graduate
Management Admission Test (GMAT) became available, one that provided
immediate item-by-item feedback and one that simulated the actual test.
Practice software for the Graduate Record Examination (GRE) was offered
in fall 1986. Ward (1984) notes that one important benefit of such software
may be motivational, with students finding it more entertaining to attack
review and drill at the computer than on paper. A second value may be
utilization of the computer to monitor the student's performance, because
the computer can track the student's use of time, branch between practice
and instruction, reintroduce questions that prove troublesome, and in
short provide considerably more individualization than is usually available
in the classroom.

Computer-Adaptive Placement Tests

Tests are also being developed to be taken directly at the computer. 
The new placement tests are among those of greatest interest. In some
instances, traditional tests have simply been transferred to machines, but a
more recent development is the computer-adaptive test, which has different 
questions for different test takers. In such tests, the difficulty level of each 
succeeding question depends on whether the student answers the previous 
question correctly. Such a test begins to capitalize on the capabilities avail- 
able with a computer. 

Moving toward extensive use of the computer, Educational Testing
Service (ETS) completed the pilot-testing of its computer-adaptive place- 
ment battery (Computerized Placement Test) in 1986 and subsequently 
made the test available for purchase. The three modules offered include 
written communication, learning skills, and mathematics. The student 
takes the test at the computer, responding to questions through an easily 
learned response mode. If the student's answer is correct, the computer 
provides a more difficult question. If the student's answer is incorrect, the 
computer asks an easier question, thus testing at the student's instructional 
rather than at the student's frustration level. This format, which makes 
use of a data bank of 120 questions for each test area, requires each student
to answer between twelve and seventeen questions before the student's 
ability level can be determined with accuracy (Forehand, 1986). 
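
The branching rule described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the
ETS procedure: the five difficulty levels, the function names, and the
fixed seventeen-item limit are hypothetical, and an operational test would
choose items and decide when to stop on the basis of item response theory
rather than a simple one-step-up, one-step-down rule.

```python
def run_adaptive_test(item_bank, answers_correctly, start_level=3, max_items=17):
    """Give one item at a time, moving up a level after a correct answer
    and down a level after an incorrect one.

    item_bank: dict mapping difficulty level -> list of item ids (assumed layout)
    answers_correctly: callable(item_id, level) -> bool; a hypothetical
        stand-in for the student's response at the terminal
    """
    levels = sorted(item_bank)
    level = start_level
    used = set()
    history = []
    for _ in range(max_items):
        candidates = [i for i in item_bank[level] if i not in used]
        if not candidates:
            break  # no unused items left at this level
        item = candidates[0]
        used.add(item)
        correct = answers_correctly(item, level)
        history.append((item, level, correct))
        # Step one level up or down, staying inside the available range.
        idx = levels.index(level)
        idx = min(idx + 1, len(levels) - 1) if correct else max(idx - 1, 0)
        level = levels[idx]
    return history


def estimate_level(history, last_n=6):
    """Rough ability estimate: mean difficulty of the most recent items."""
    recent = [level for _, level, _ in history[-last_n:]]
    return sum(recent) / len(recent)
```

With a bank of 120 items spread over five levels, a student who can
handle items up to level 3 will oscillate between levels 3 and 4, so the
estimate settles between those two values.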

ACT is also offering computerized assessment. It is designing new 
components for its computer-adaptive testing, and it has plans to link
skills testing with its vocational assessment and career-planning package,
Discover. A pilot study is under way at Phoenix College in Maricopa
District, Arizona, where 100 computer terminals are being used for college
entrance testing (Papparella, 1986). 

Adaptive testing requires a large item bank; each item must be
scaled according to its difficulty. The computer stores the items, calculates
their selection, and facilitates test administration. Adaptive testing is made
possible by an advance in measurement theory known as item response
theory, which provides a mathematical basis for selection of the appropriate
question at each point and for computation of scores that are compatible
between individuals. Item response theory has been the subject of
intensive theoretical and empirical research for thirty years, but its demanding
computational requirements have prevented it from being feasible for
use in microcomputer testing until recently (Lord, 1980). 
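
To make the idea concrete, the sketch below uses the one-parameter
(Rasch) logistic model, one of the simplest item response models. It is
offered as an assumption for illustration; the source does not say which
model any particular test uses. Each item has a scaled difficulty b, the
probability of a correct answer for ability theta is
1 / (1 + exp(-(theta - b))), and the next question is the unused item
that carries the most information at the current ability estimate. The
function names and the grid-search estimator are hypothetical.

```python
import math


def p_correct(theta, b):
    """Rasch probability that a person of ability theta answers an item
    of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Information an item provides at ability theta: P * (1 - P)."""
    p = p_correct(theta, b)
    return p * (1.0 - p)


def next_item(theta, difficulties, used):
    """Index of the unused item that is most informative at theta."""
    remaining = [i for i in range(len(difficulties)) if i not in used]
    return max(remaining, key=lambda i: item_information(theta, difficulties[i]))


def estimate_theta(responses, difficulties):
    """Maximum-likelihood ability estimate by grid search over [-4, 4].

    responses: list of (item_index, answered_correctly) pairs
    """
    grid = [g / 10.0 for g in range(-40, 41)]

    def log_likelihood(theta):
        total = 0.0
        for i, correct in responses:
            p = p_correct(theta, difficulties[i])
            total += math.log(p if correct else 1.0 - p)
        return total

    return max(grid, key=log_likelihood)
```

Because every response is scored against the item's scaled difficulty,
two students who answer entirely different questions still receive
estimates on the same scale. Note that the estimate is only meaningful
once the student has at least one correct and one incorrect answer.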

Traditional norm-referenced testing usually offers a large number 
of moderately difficult questions with a few very easy questions and a few 
very difficult questions. In order to discriminate ability levels, everyone 
who is tested is asked to answer all the questions. Computer-adaptive
testing can obtain the same results by asking only a few questions. However,
such testing requires extensive research and data to develop the question
pool and the computational procedures. These are available only in
powerful computers (including some microcomputers), so the most effective
use will probably continue to be for professionally developed large-scale
placement and diagnostic tests.

According to Wainer (1983), computer-adaptive testing has the fol- 
lowing advantages: Test security is improved; the individual can work at 
his or her own pace, and the speed with which the individual responds
provides additional assessment information; each examinee stays productive,
challenged but not discouraged; there are no problems with answer
sheets, erasures, or response ambiguity; the test can be scored immediately;
and immediate feedback is available in the form of various reports. 

The fact that the test is not timed is another benefit, since it takes 
the pressure off test-anxious or handicapped students. In addition, it minimizes
the need for monitoring. Still another advantage is the flexibility
that it affords: Students can be tested at virtually any time; students who 
register late or otherwise miss mass testing dates and students who need 
test results at a particular moment can be quickly served. Such a test also 
provides an alternative for students who want to challenge the results of 
other tests. 

In addition, according to one school involved in the pilot-testing 
for ETS, students are surprisingly positive about taking the test on the 
computer, even those who have never used a computer before. The testing 
officer admitted that he had been reluctant to use the computer-adaptive 
test but that he was now enthusiastic because of its versatility and because 
of the very positive student response (Rutledge, 1986). 

The disadvantages of computer-adaptive testing include the neces- 
sity of providing every test taker with a computer terminal (thus far, the 
test can be used only on IBM-compatible machines) and the cost of the 
test. As terminals proliferate on campuses, the first problem may become 
less significant, and the costs may be absorbed on many campuses through
student fees. Nevertheless, it seems unlikely that the computer-adaptive
test will soon completely replace the paper-and-pencil mass testing now
in place at most colleges.

Tests Taken at the Computer: Other Types

Many other kinds of tests are being developed for the computer,
including academic and vocational assessments and tests for special 
populations. 

Vocational Tests. One area of growing interest is in the field of 
vocational assessment, both interest and aptitude. A computerized version
of the Ohio Vocational Interest Survey (OVIS II) is available. The primary 
advantage is administrative: With computers, the scores are obtained faster,
the speed and accuracy of administration are greatly enhanced, the
results are available more quickly, and test security is increased 
(Hambleton, 1984). 

Other tests provide a range of assistance directly to the student. 
For example, such tests as MicroSkills and Sigi Plus begin with self- 
analysis questions and permit the student to narrow the focus down so 
that very specific information can be obtained directly from the computer. 
MicroSkills asks the student to identify the skills that he or she most
wants to continue to use and provides a list of the occupations and 
industries that best match the student's interests. Sigi Plus integrates the 
skills, interests, and values that the student has identified into job recom- 
mendations. 

MESA and Apticom, two vocational batteries, measure both aca- 
demic and manual skills as well as interests. Students use a joystick to 
take the MESA test, and the facility with which they use it becomes part of 
their dexterity measure. Apticom makes use of a "probe" that the student 
inserts into answer spots on a large card. The data that are recorded 
include an assessment of eye, hand, and foot coordination and other phys- 
ical abilities based on speed and accuracy. These skills, along with the 
student's recorded preferences and answers to math and language questions,
are combined into a comprehensive report that makes recommendations,
using the Dictionary of Occupational Titles, about the vocational
choices that seem appropriate. 

These tools are coming into increasing use at community colleges, 
where students, including returning adults, are often confused and uncon- 
fident about their own abilities and appropriate career choices. 

Special Population Tests. For some students, computers offer a tremendous
advantage over paper-and-pencil tests. Large-print systems make
the computer screen accessible for individuals with poor vision. Sophisticated
screen-reading software, combined with high-level speech synthesizers,
such as DECtalk, provides computer access for blind individuals.
Questions and responses can be presented through headphones, and the
student can hear what he or she has typed on the screen. Students with
mild to profound orthopedic disabilities can access the computer through
a variety of adaptations, including smart word processors, speech recognition
systems, and programs to modify keyboard functions. Spelling
checkers, combined with smart word processors and speech output devices, 
create a new and effective writing environment for students with learning 
disabilities. A variety of modalities can be used to offer input through 
visual channels, auditory channels, or both. Other features of computerized 
testing, including enlarged print, auditory feedback, word-by-word reading 
and review, varying screen colors, and expanded time frames, have benefits
for learning-disabled students. 
Diagnostic Assessment and Instruction

Diagnostic assessment, which can be used after initial assessment 
both as a progress measure and as an outcome measure, is another area of 
rapidly growing interest. 

Diagnostic-prescriptive computer-adaptive test series are currently 
under development by ETS and ACT (Forehand, 1986; Papparella, 1986). 
These tests are intended primarily for the classroom or for learning centers 
after the student has been initially screened. For example, a student may
fail the English placement test, but with what specifically is the student 
having problems? Would it help for a teacher in a remedial math class to 
know the specific areas in which each student was weak or to have a class 
profile of students' strengths and weaknesses? Would it help a student 
who was doing poorly in school to assess his or her study skills? 

Both ETS and ACT are betting that the answer is yes to all these 
questions. At ETS, thirty prototype tests currently under development
cover basic and advanced math, grammar, writing, reading, and study
skills. Each test is highly interactive. Features include computer-generated
narrative reports, feedback and second try when appropriate, special-purpose
response modes, an analysis of why mistakes were made (based on
the branching that probes beneath the correctness or incorrectness of
responses), and instructional suggestions. Although the tests were conceptualized
for use at the community college level, interest in the materials is
high among those who have worked with them, including professionals 
from both the high school and the university levels. Seventy-one percent 
of the students who took part in the field-testing indicated that they pre- 
ferred to take a test by computer, while only 16 percent indicated no pref- 
erence (Forehand, 1986). 

Linking Assessment and Instruction 

Computer technology has increased our ability to draw assessment 
and instruction activities close together. Research and increasing knowl- 
edge about cognitive processes, combined with diagnostic assessment, will 
have a major impact on instruction. For example, studies to examine the 
use of language in the cognitive process (Chaffee, 1985) and the student's 
cognitive approach to a discipline (Chi, Feltovich, and Glaser, 1981; Sternberg,
1981) have been undertaken. These efforts to examine the cognitive
process help us to understand the interaction between examinee and 
machine and the strategies that a learner uses to acquire knowledge. Addi- 
tional research, in cognitive science in particular, will be valuable for 
increasing the interrelationship between assessment and instruction 
(Glaser, 1985; Hunt, 1985; Madaus, 1985). 

The future may well see extensive classroom use of the computer 
diagnostic test, with the interactive computer maintaining a record of
every student's performance and tracking errors to identify patterns and
problems. On-screen feedback will be immediate, acknowledging correct
answers and rectifying incorrect answers, suggesting instructional materials
that can correct the errors. As Ward (1984, p. 18) comments, "Identification
of errors with this level of precision offers the possibility of specific
remediation, and the statement of error leads directly to a prescription for
the necessary instruction. . . . These types of analysis may eventually lead
to a new generation of assessment instruments that can be linked more
directly to instructional sequences than are present tests. Because of the
complexity of the models and the application to the analysis of a given
student's performance, the computer will be an indispensable tool."

The use of assessment for outcome measurement was the subject of
an August 1986 symposium in Laguna Beach, California. Participants,
college practitioners from various states, agreed that assessment will
become increasingly differentiated in terms of the concepts and capabilities
assessed and that it will continue to expand as one product of student
consumerism. Participants agreed that such assessment has a formative
function and that it should have an impact on curriculum and programs
rather than serve as a gate that keeps students from progressing (Bray, 1986).
Again, the questions of cost and computer availability may be significant.

Scoring Tests and Generating Data

One other key area in which technology is moving quickly is in
scoring tests and sorting data. In the past, technology has been most often
tied to the speed of scoring, with machines used to sort, analyze, and even
comment on the results. The Scantron machine, which "reads" the pencil
marks on special multiple-choice answer sheets fed into the machine and
indicates which marks are incorrect, is readily available to many classroom
teachers. 

However, by linking the machine directly to a computer, institutions
have become able to tie machine scoring to a number of other uses.
As placement tests are scored by Scantron, the results can be evaluated and
entered directly into the students' files, which substantially reduces the
time needed for entering data and correcting errors. When necessary, the
computer can provide the student, the institution, or both with an immediate
printout of the analysis. Typical of the new programs is the software
now available through a group of educators in Santa Barbara, California
(Computerized Assessment and Placement Programs or CAPP), which links
with Scantron and scores the selected test; determines placement; generates
reports for counselors, teachers, students, and administrators; and prints
an individualized letter and mailing label for each student (Brady and
Elmore, 1986). 
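
The score-then-place sequence that such software automates can be
sketched as follows. This is a hypothetical illustration, not the CAPP
program: the function names, the cutoff scores, and the course titles are
all invented for the example.

```python
def score_answer_sheet(marks, answer_key):
    """Count the marks that match the key; a blank (None) counts as wrong."""
    return sum(1 for mark, key in zip(marks, answer_key) if mark == key)


def place_student(score, cutoffs):
    """Return the course with the highest minimum score the student meets.

    cutoffs: list of (minimum_score, course_name) pairs; the values used
    in any real program would come from local placement research.
    """
    eligible = [(minimum, course) for minimum, course in cutoffs if score >= minimum]
    if not eligible:
        return None
    return max(eligible)[1]


def placement_line(name, score, course):
    """One line of the kind of report a counselor or student might receive."""
    return "%s: score %d, recommended course: %s" % (name, score, course)
```

Once the score and placement exist as data, the same record can feed
the student file, the counselor's report, and the individualized letter
without being keyed in again.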
Major testing companies, such as American College Testing, CTB
McGraw-Hill, Educational Testing Service, and the College Board, also 
offer services that score entrance and placement exams, relate the data to 
information about other students who have taken the test on a particular 
campus or to national norms, and provide comprehensive feedback in the 
form of scores, interpretations, and predictions related to specific programs.
Validity studies and data analysis by ethnicity, age, sex, and a host
of other variables are becoming increasingly routine. Data available from 
the companies just named have become increasingly detailed as the com- 
panies have competed to meet the assessment needs of college admissions 
and placement programs. 

For example, ASSET, ACT's program for community colleges, 
incorporates a comprehensive orientation, testing, and research package.
The research provides accountability, placement, and retention information
and includes an ability profile report for students in specific programs
as well as a grade experience table that correlates test results to course 
grades so that a college can develop its own local placement norms. ACT 
has recently added software to ASSET.

The Placement Research Service (PRS) offered by the College Board 
allows an institution to select up to nine different predictors: Seven differ- 
ent tests, two optional predictors (such as high school grades, teacher rec- 
ommendations, and so forth), and seven different measures of academic 
success (such as grade point average, grades in English classes, grades in 
math classes, and faculty ratings) are available. The data provided to the 
institution include the score distributions, correlations of all predictors,
two-way tables of observed data, and expectancy tables. 

Information for Students 

One impact of the growing emphasis on assessment and informa- 
tion collection has been a movement toward providing students with 
increasingly complete information, a sort of consumer awareness movement
that is a far remove from the days when students' results were a
carefully guarded secret held close to the chest by counselors while they 
gave students the benefit of their professional analysis. 

Increasingly, colleges with sophisticated computer systems are devel- 
oping their own institution-specific programs that report test results directly 
to the student, providing scores, statistical interpretations, and commentary 
or advice in different degrees of formality or friendliness. A 1983 study of 
the four California community college assessment programs that were considered
most effective by their colleagues found that one of the few commonalities
among the four was the prescriptive computer printout that
students were given. Comments ranged from a fairly impersonal listing of 
scores and recommended classes to a chatty form that addressed the students 
by their first names and made various suggestions, such as dropping in and
visiting a specific person in a tutorial program (Rounds, 1985). Such reports
are given to students individually or in group settings where college staff
review particular responses and help students with further interpretation.
The reports are considered cost-effective, and they can be used to supple- 
ment or even replace individual meetings with counselors. 

There is also a growing interest in "expectancy" or "probability" 
tables, such as those provided by both ACT and ETS. Using correlation 
data from previous test scores and grades, such tables estimate the probability
that a student with a specific score will earn a specific grade in
an identified course. Many counselors consider such a table to be an effective
way of guiding student selection.
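
A minimal sketch of how such a table can be built from a college's own
records follows. It assumes the simplest possible method, observed
proportions within score bands; the tables published by the testing
companies may rest on more elaborate statistical models. The function
name, the score bands, and the sample records are hypothetical.

```python
from collections import Counter, defaultdict


def expectancy_table(records, bands):
    """Estimate, for each score band, the proportion of past students
    who earned each grade.

    records: iterable of (test_score, course_grade) pairs
    bands: list of (low, high, label) score ranges, inclusive
    """
    counts = defaultdict(Counter)
    for score, grade in records:
        for low, high, label in bands:
            if low <= score <= high:
                counts[label][grade] += 1
                break
    table = {}
    for label, grade_counts in counts.items():
        total = sum(grade_counts.values())
        table[label] = {grade: n / total for grade, n in grade_counts.items()}
    return table


# Invented records for illustration only.
records = [(12, "C"), (14, "B"), (15, "C"), (18, "D"),
           (22, "A"), (25, "A"), (27, "B"), (29, "B")]
bands = [(10, 19, "10-19"), (20, 29, "20-29")]
table = expectancy_table(records, bands)
```

Read across a row, the table tells a counselor that, among past students
in a given score band, a stated share earned each grade. With only a
handful of records per band, as in this invented sample, the proportions
would be far too unstable to guide real advice.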

The Future 

Many exciting possibilities for the use of computers are already 
being explored, and others lie just around the corner. Test capabilities
include options that should provide us with a better way of assessing 
each individual. For example, a wider variety of questions is becoming 
possible— including memory testing through successive frames, and, with 
voice synthesizers, spelling tests and tests of the understanding of spoken 
language. 

Advances in technology permit the increased use of graphics and 
animation to simulate the actions and events that are the focus of a question.
Simulations that permit students to select activities and solutions,
such as a simulated chemistry experiment or nursing problem,
may be a better way of assessing some skills than the ways we now possess.
Simulations could replace the long written narratives describing problem-solving
situations on exams for police and fire fighters. The use of interactive
video will open many additional options, including touch screens
for item response. For example, ACT already has experiments under way
linking videodisc technology with the Discover career-planning package
to offer real-life presentations to students. Improvements in optical disc
technology should soon make desktop storage of high-resolution visual
displays an inexpensive and convenient way of presenting test stimuli
(Millman, 1984; Hale, Oakey, Shaw, and Burns, 1985; Ziegler, 1986).

Another exciting possibility may be analysis of student writing. 
Although such analysis currently seems beyond the range of computers, 
such systems as Bell's Writer's Workbench, IBM's Epistle, and UCLA's 
WANDA program have already made substantial progress in analysis of 
writing samples. All these systems are able to detect a number of errors 
and writing weaknesses and to measure low-order writing attributes. Perhaps
it is not too much to hope that one day the computer may be able to
handle the student essay. 
As we gain better information about cognitive processes and as the
speed and efficiency of computer technology increase, we should be able
to develop measures that test each individual's special skills and knowledge
and provide the diagnostic information that will be most useful in helping
students to make effective choices. Ongoing diagnosis will affect selection
of learning tasks and classroom instruction. Accuracy and speed will
improve, and costs should decrease as we capitalize on the special opportunities
provided by the computer.

The possibilities are limitless and exciting. If we are able to maintain
humanistic goals for assessment, then perhaps Wainer's (1983) optimism
will be vindicated. The focus will be on the qualities of the
individual, and technology will be a wise servant, not a demanding master. 
Assessment systems need to be designed for new student
populations--the "new" majority who no longer fit the
traditional profile. In contrast to programs for full-time
students who are recent high school graduates, the model 
proposed here features a customized planning information 
sequence tailored to the diversity of today's students. 



Is There Life After College? 
A Customized Assessment 
and Planning Model 

Susan S. Obler, Maureen H. Ramer



Maria is twenty-five years old, entering college for the first time after a
series of secretarial jobs following high school graduation. She longs for
more stimulating work, having discovered that she is more skilled with
subordinates than her supervisors are. However, she suspects that she will 
need a college degree in order to move forward into more challenging 
positions. 

George has entered college directly from high school, where he 
just barely accumulated enough credits to graduate. With his buddies, he
shares a vague sense that "college is good for you," but they have very 
amorphous goals. They also have little family support for delaying full- 
time employment. 

Sherril is thirty-two years old and recently divorced. She has two 
boys, ages three and nine. Although she is very motivated to find fulfilling 
work, she fears that her basic skills will not permit her to compete in the 
job market. She favors the health care field, but she wonders where her 
talents will fit. 

Nguyen, a former teacher, is forty years old. He has been in this
country for two years. His language skills are improving, but his factory
job wastes his many skills, and his low career status is disturbing at best.
His employer is closing the plant, and Nguyen's technical skills will need 
updating if he is to remain in manufacturing. Understandably, he would 
love to return to teaching. 

These and a large percentage of community college students today 
no longer fit the traditional profile of the recent high school graduate
who plans to get an A.A. or B.A. Many assessment and matriculation 
programs are designed for the traditional student. In contrast, today's 
"new" students need an individualized career assessment and guidance
process that provides them with the information and interaction that they
need in order to plan intelligently. 

In spite of the numerous reports on new student populations, there
is a gap between the awareness of these changes and existing campus
assessment programs. The lack of appropriate services continues to stymie
student success. Most of the new students are adults, and the goals of
many are vague. At the same time, college personnel have scrambled to
survive menacing budget cuts and declining enrollment. Their energies
have been distracted from the assessment needs of these new students, and
funding for new programs has been extremely limited. The expanded
assessment model proposed here is consistent with the emerging paradigm
that emphasizes the "new" adult student.

Figure 1. Assessment and Counseling Paradigms

Previous Emphases:                      Emerging Emphases:
Traditional Community                   New Community
College Student                         College Student

High-school or G.E.D. graduate;         Adult student; first career and
first-career oriented                   career redirection

Curriculum planning: "Major,"           Curriculum planning: long-range
short-range planning, or transfer       career development, professional
                                        paths

School or college as end in itself      College training as means to goal

Youth-oriented guidance counseling      Adult-oriented career assessment
staff                                   staff

School role: internal review of         College role: external review of
available programs based on limited     planning and decision making based
information                             on expanded information

Community college role over when        Community college role continues to
student transfers or completes A.A.     assist in recurring career decisions
or certificate

Present-oriented, short-range,          Future-oriented, cross-career skills,
one-job, narrow skills focus            focus emphasizing problem solving,
                                        communication, critical thinking

Assessment: narrow, skills and          Assessment: broad, value added, and
achievement oriented                    potential oriented

Curriculum designed as foundation       Curriculum designed to provide adults
for further academic study              with workplace skills and growth
(organization centered)                 (student centered)

Assessment occurs once only as a        Initial assessment forms baseline
review before registration or as an     used to monitor subsequent progress;
orientation process                     follow-up occurs regularly

All students follow same assessment     Customized process focuses on
process                                 individual students

Shifting Paradigms: From Prescribing to Empowering 

Due to the history of the community college movement, assessment 
and counseling services were once modeled after secondary school 
approaches. The goals of assessment and guidance were somewhat binary: 
college or noncollege, transfer or nontransfer. Students were then advised 
on class schedules for available curricula. With such a narrow focus, assess- 
ment serves the college programs more than it does the students, and the 
curriculum becomes an end in itself rather than the means to a goal
(Garza, 1986). Such goal displacement and constricted options can threaten 
students' motivation. That is, if assessment systems communicate limited, 
short-range purposes, students will perceive assessment in the same dead- 
end way. 

These changes in perception and approach appear as paradigm 
shifts in Figure 1. The old, narrow system designed for the traditional
student is moving toward a broad, diversified model that serves the needs 
of the new student. 

The Assessment Model in Action 

The broad assessment model proposed here (depicted in Figure 2)
is based on four assertions: First, students will succeed more readily
with clear goals. Second, most students intend to pursue a career after college.
Third, many adults require help with career redirection. Fourth, community
colleges should be the primary community resource for career
redirection. The goal of this model is to enable students to define their 
personal goals and to plan an instructional program as quickly as possible. 

Every student begins with an interview that is conducted by a pro- 
fessional career counselor. The counselor obtains a profile of the student's 
formal education (A). If the student has a defined career goal, he or she 
will only require assessment of the basic skill competencies directly related
to the objective. The student then proceeds to step (F) in order to develop 
an academic plan. However, most practitioners recognize that students 
who do not have a clearly defined career goal need to proceed through 
several steps in the process. 

Figure 2. Model for Career Assessment and Educational Planning

For example, Maria needs only part of the model due to her work
history. Following the intake interview, she receives a plan for tests in
career aptitude and personality and interest inventories for professional
level positions (B). She and the career specialist consider and interpret the
results (C). She finds that she is detail oriented and well suited to fiscal
management careers. She agrees with the outcome of the testing, so she
does not need the directed career research (E). With her counselor, she
develops an academic plan (H): an A.A. degree in accounting with electives
in business management. She enrolls in college (I).

George discusses his limited high school record at the initial interview
(A). Since he has little work experience, he and his counselor decide
that he should take the full range of tests: basic skills, career testing, aptitude,
and so on (B). At the test results interview (C), George's interest in
art emerges undeniably. Following additional tests (D) to determine his
occupational focus, he conducts directed career research (E) on the requirements
in the various commercial art fields. With these data, George reviews
his alternatives in another interview (F), and he decides to enter commercial
art. Unfortunately, his college does not have this program, so he is
referred to a neighboring college that does.

Sherril discusses her lack of confidence in communication skills and
receives a plan for basic skills tests, interest inventories related to the health
care field, and aptitude testing (B). After these tests, she meets with the career
counselor to review her results (C). Her test results indicate a strong interest
in the field of respiratory therapy. To find out more, she pursues directed
career research (E). After reviewing all her information with a counselor (F),
she develops an academic plan (H) and enrolls in college (I).

At his intake interview, Nguyen discusses his desire to return to the 
teaching he loved in his native land. Since his goal is clearly defined, his
tests are primarily limited to academic skills (B). After the career specialist
interprets his results (C), Nguyen conducts career research to determine
the requirements for a teaching credential in the state (E). The information
is reviewed (F), and the curriculum plan that is developed (H) includes
written and oral language skills and the lower-division course work
required for a teaching credential.

The means for gathering data that can be used to evaluate the prog- 
ress both of individuals and of groups is built into this system. One of the 
goals of the process is to retain students by helping them to define their
goals. The individual data and subsequent evaluation (K) are the means
for measuring the success of this outcome. The overall group data and
subsequent evaluation (J) are a means of measuring the success of the
system in increasing the retention of students.

The Strength of the Model

The model described here has many strengths and advantages. First,
the assessment and interpretation procedures are completely customized to
each student; this feature communicates the college's willingness to deal
with the needs and abilities of individuals. Further, the student's personal
involvement helps the student to "own" his or her goals and increases the
student's motivation. The student also has a full report and discussion of
his or her strengths and liabilities. Because the student and the counselor
are in contact at every step, the evolving exchange incorporates old data
into new. 

Another advantage of this model is the documented and professionally
reviewed educational plan that the student receives. The spiral of
activities permits student and counselor to expand data and readdress goals
as often as needed. These branched steps provide the time and the information
needed for planning the most direct route to the student's goal.
The more direct the student's route to his or her goals, the more the student's
persistence increases.

Admittedly, the thorough, customized process envisioned in this
model requires careful planning and budgeting. Yet, on balance, the
program could save the college revenue that is ordinarily lost through the
attrition of students who have ambiguous goals. One way of generating
funds for this kind of assessment system is to offer it as a variable-unit,
open-entry "course." Colleges could also use the resources in federally
funded job training and vocational education programs for this purpose.
At the least, external funding could offset the start-up costs for tests and
personnel. Further, colleges could charge fees to nonenrollees from the
community.

Thus, the model helps colleges to fill the perilous gaps between
test results and a student's future. As Loacker, Cromwell, and O'Brien
(1986, p. 48) have written, "Testing, as it is frequently practiced, can tell
us how much and what kind of knowledge someone possesses, whereas
assessment provides a basis for inferring what the person can do with that
knowledge. . . . Assessment aims to elicit a demonstration of the nature,
extent, and quality of his or her ability in action." It is in this broader
spirit of assessment, not in the narrow use of testing, that the model
described here can empower the nontraditional student. Colleges must
once again focus their mission on the student's future and provide the
powerful information needed to realize and improve life after college.

Materials abstracted from recent additions to the Educational
Resources Information Center (ERIC) system provide further
information on student assessment at community colleges.



Sources and Information: 
Student Assessment 
at Community Colleges 

Jim Palmer 



Student assessment and placement programs pose several educational and
logistical problems for community college administrators: Who should be
assessed and when? What tests should be used, and how will cutoff scores
be determined? Should remediation be mandatory for those whose test
scores fall below the cutoff point? How does the testing program complement
other student services, such as advising and counseling? These questions
are addressed in a growing body of literature on assessment practices
at two-year colleges. Selections from this literature reviewed here include
descriptions of institutional assessment programs, college efforts to
evaluate testing programs and assess the predictive validity of testing
instruments, state initiatives in testing (with particular emphasis on Florida's
College-Level Academic Skills Test), and the use of cohort testing to assess
curricular efficacy.

D. Bray and M. J. Belcher (eds.). Issues in Student Assessment.
New Directions for Community Colleges, no. 59. San Francisco: Jossey-Bass, Fall 1987.

Descriptions of Testing Programs

During the early 1980s, growing interest in student assessment led
researchers to survey assessment and placement practices at community
colleges. The resulting literature includes descriptions of institutional
assessment programs at Sacramento City College in California (Haase and
Caffrey, 1983), the Grossmont Community College District in California
(Wiener, 1984^), and Triton College in Illinois (Chand, 1985). A number
of statewide analyses have appeared, including Ramey (1981), who examines
the procedures used by Florida community colleges in 1980 to assess
the skills of entering students; Rivera (1981a, 1981b), who describes English
placement systems at community colleges in California and Arizona;
Forstall (1984), who reviews the approaches to student assessment and
placement used by the Illinois community colleges; Rounds (1984) and Rounds
and Andersen (1984), who examine placement practices in the California
community colleges; and the Washington State Student Services Commission
(1985), which outlines the components of model assessment programs
in place at the Washington community colleges. The information in these
reports cannot be considered current, because practices in the area of
assessment and placement change continuously. Nonetheless, the reports point
to the diversity of assessment practices employed and emphasize that the
colleges differ greatly in terms of the subject areas assessed, the assessment
instruments used, and the ways in which the results of assessment are used to
advise and place students. A composite picture of community college
assessment practices is not easy to draw.

Most of the studies just named show that assessment efforts serve
primarily as a sorting function for entering students. While this function
serves the useful purpose of identifying students whose skills deficiencies
jeopardize their chances of completing college-level courses successfully,
some authors have pointed out that assessment information is more
effectively used in the context of student flow. For example, Walvekar (1982)
urges a three-stage approach to evaluation: assessment of skills at entrance,
ongoing assessment of students during their college career to determine
whether instructional programs need to be modified in order to meet
student needs, and follow-up evaluation to document student learning on
program or course completion. Cohen (1984-85) argues that assessment
should be viewed as part of an overall student retention effort, not simply
as an initial placement mechanism. He draws on the literature to show
how student orientation, tutorial activities, and other supplemental
support services complement entry testing in an overall retention program
that starts with recruitment and ends with follow-up activities. Finally,
Bray (1986) urges educators to link assessment outcomes with instructional
improvement and student retention by using test results as a guide to
course development and student services. She illustrates how this can be
done by describing the student flow model at Sacramento City College
(California) and the assessment and placement model developed by the
Learning, Assessment, and Retention Consortium of the California
community colleges.

Evaluating Student Assessment Programs

Do assessment and placement programs improve student academic
performance and persistence? A few colleges have used quasi-experimental
designs to assess the academic performance of students who followed the
placement prescriptions generated by assessment procedures. The results
are mixed, reflecting the difficulty of drawing causal relationships between
assessment and academic performance.

Among those attributing positive effects to student assessment are
Boggs (1984), Borst and Cordrey (1984), and Richards (1986). Boggs (1984)
compares the overall grade point average (GPA) of students in English
classes at Butte College (California) before and after implementation of
the college's literacy skills assessment program. He determined that, while
the high school GPAs of entering students did not significantly change
after implementation of the assessment program, the college GPAs of the
students did. Borst and Cordrey (1984) compare the cumulative GPAs
earned over three semesters by two groups of students at Fullerton College
(California): those who tested poorly in reading or writing skills and
subsequently underwent remediation and those who tested poorly but avoided
placement in remedial classes. The students undergoing remediation
earned higher GPAs, which led the authors to suggest that the chances of
academic success increase if students follow assessment prescriptions.
Richards (1986) conducted a similar analysis, comparing the academic
success and persistence of Colorado community college students who
followed assessment prescriptions regarding course placement with the
success and persistence of those who did not. The former tended to succeed at
a significantly higher rate than the latter, but in a small number of cases
those who did not follow the advice of counselors succeeded nonetheless.

Losak and Morris (1983) have also documented the phenomenon of 
successful students who do not follow placement prescriptions. They sug- 
gest that a student's deliberate decision not to enroll in remedial courses 
despite poor test scores may in some cases be appropriate. The authors 
base this position on an examination of the retention and graduation 
rates of students who entered Miami-Dade Community College (Florida) 
in fall 1980. More than half of the entrants whose basic skills test scores 
indicated a need for remediation chose not to participate in remedial 
classes. It is interesting that the retention and graduation rates of these
students were as high as or higher than the retention and graduation rates
of students who did take remedial classes.

Friedlander's (1984) evaluation of the Student Orientation, Assessment,
Advisement, and Retention program (SOAAR) at Napa Valley College
(California) also suggests that assessment and placement services may
not always be effective. SOAAR was designed to assess entering students'
reading skills, advise students with low assessment scores to enroll in
remedial courses, and inform students of college services. But Friedlander
compared SOAAR students to a similar group of students enrolled before
implementation of the SOAAR program and found that the SOAAR students
were actually less likely to complete courses and earn passing grades. He
also found that test scores did not predict student success accurately and
that SOAAR did not increase students' use of supplemental support ser- 
vices. Among other recommendations, Friedlander (1984, p. 4) proposes 
that "assessment of students' skills should go beyond reading and arith- 
metic ability to include study skills and, if possible, attitude toward 
learning." 

Assessment Validity 

The literature is also concerned with the predictive validity of the
testing instruments used in assessment programs. Several documents
describe college efforts to correlate subsequent student grades with scores
on various tests, including the Differential Aptitude Tests (Digby, 1986);
the College Board's Descriptive Tests of Language Skills (Rasor and
Powell, 1984); the American College Testing Program's Assessment of
Student Skills for Entry and Transfer (Abbott, 1986; Santa Rosa Junior
College, 1984; Roberts, 1986); the College Board's Multiple Assessment
Programs and Services (Abbott, 1986); the English Qualifying Exam
(Beavers, 1983); the Nelson Denny Reading Tests (Loucks, 1985); and the
Comparative Guidance and Placement Program's tests of reading and written
English expression (Miami-Dade Community College, 1985). Most of these
studies find only low correlations, if any, between test scores at entrance
and subsequent student grades, reflecting the fact that variation in
instructors' grading practices makes it difficult to predict grade outcomes
uniformly. For example, Spahr (1983) regressed the English and algebra grades
earned by students at Morton College (Illinois) against several independent
variables and determined that, while placement test scores accounted for about
15 percent of the variance in student grades, instructor differences
accounted for about 27 percent.
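
The kind of comparison Spahr reports can be illustrated with a minimal
sketch in Python. It assumes ordinary least squares with one indicator
variable per instructor and measures the extra share of grade variance
explained once instructors are added to a model that already contains
placement scores. Spahr's actual specification is not described here, and
the function names below are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """Share of variance in y explained by an ordinary least-squares fit on X."""
    X1 = np.column_stack([np.ones(len(y)), X])      # add an intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1.0 - resid.var() / y.var()

def variance_shares(test_scores, instructor_ids, grades):
    """Return (R^2 from test scores alone, extra R^2 from adding instructors).

    Assumes at least two distinct instructors and grades that are not all equal.
    """
    test_scores = np.asarray(test_scores, dtype=float)
    grades = np.asarray(grades, dtype=float)
    instructor_ids = np.asarray(instructor_ids)
    instructors = np.unique(instructor_ids)
    # One indicator column per instructor, omitting the first as the reference.
    dummies = np.column_stack(
        [(instructor_ids == i).astype(float) for i in instructors[1:]]
    )
    r2_test = r_squared(test_scores.reshape(-1, 1), grades)
    r2_full = r_squared(np.column_stack([test_scores, dummies]), grades)
    return r2_test, r2_full - r2_test
```

Called with arrays of placement scores, instructor identifiers, and course
grades, the function returns the two shares as fractions of total variance.
Because the shares depend on the order in which variables enter the model,
a sketch of this kind indicates relative weight only, not a causal account.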

Thus, the weight of the evidence shows that the predictive validity
of entrance tests in terms of subsequent grades is highly questionable. In
light of this, several authors urge that tests be used with caution. For
example, Spahr (1983) argues that assessment programs must consider the
multiple factors that affect academic success in addition to cognitive ability
in specific skills. This may require colleges, he concludes, to use multiple
cutoff scores, eliminate entrance testing altogether for certain programs,
or work with faculty to minimize inconsistencies in grading practices.
Neault (1984) concurs that there is a need for the cautious application of
testing and urges colleges to eschew rigid adherence to absolute cutoff
scores in recognition of the fact that many students are borderline cases.
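
One way to picture the alternative to an absolute cutoff is a band around
the cutoff within which a score triggers individual review instead of
automatic placement. The following Python sketch is illustrative only: the
cutoff, the width of the band, and the labels are hypothetical values that
a college would have to set for itself, and none of them comes from the
studies reviewed here.

```python
def placement_advice(score, cutoff, margin):
    """Classify a placement test score relative to a cutoff with a borderline band.

    cutoff and margin are hypothetical, locally chosen values.
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")
    if score < cutoff - margin:
        return "remedial"
    if score < cutoff + margin:
        # Close to the cutoff: refer to a counselor and weigh other evidence.
        return "borderline"
    return "college-level"
```

With a margin of zero the function reduces to the rigid single-cutoff rule
that Neault cautions against.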

State Testing Initiatives

In addition to using tests for student placement, states also use
testing as an accountability tool certifying that the students who advance
through the educational pipeline have mastered reading, writing, and
computational skills. For example, New Jersey requires entering students in
the state's public postsecondary institutions to take the New Jersey College
Basic Skills Placement Test; test results are used to place students needing
remediation and to monitor changes in the skills abilities of entering
students over time (New Jersey Basic Skills Council, 1986). In Georgia, the
Board of Regents of the state university system requires degree-seeking
students in public colleges and universities to demonstrate minimum
competencies in reading and writing skills (Bridges, 1986).

Much of the literature on state-mandated minimum competency 
testing focuses on the tests required for high school graduation or for 
those seeking teacher certification. But Florida's College-Level Academic
Skills Test (CLAST), which is required of all students seeking an associate 
in arts degree or upper-division status in the state university system, has 
placed the issue of minimum competency testing squarely within the 
realm of the community college transfer function: Students must pass the 
test in order to attain junior status. Much of the literature on CLAST 
emanates from the Office of Institutional Research at Miami-Dade Community
College. Drawing on the CLAST scores of Miami-Dade students,
these reports focus on such topics as the characteristics and educational 
backgrounds of students who fail (Belcher, 1984b, 1986); CLAST out- 
comes for special populations, including those who enter the community 
college with test scores that make them ineligible for the state university 
system (Losak, 1984b; Belcher and Losak, 1985), ethnic minorities 
(Belcher, 1984c), and English-as-a-second-language students (Belcher,
1985e); the relationship between grades earned at Miami-Dade and subsequent
performance on the CLAST (Belcher, 1985a; Losak, 1984a); the
relationship between a student's level of basic skills at entry and pass-fail 
rate on the CLAST (Belcher, 1984a); the curricular correlates of success 
on CLAST, including the contribution of developmental, mathematics,
and English classes to student pass rates (Belcher, 1985b, 1985c, 1985f);
the effect of increased test-taking time on CLAST performance (Wright,
1984a); the question of whether additional attention to test-taking strate- 
gies might significantly improve passing rates (Belcher, 1985d); and stu- 
dents' opinions on the adequacy of their preparation for the CLAST 
(Wright, 1984b). These reports reveal that those entering the college with
lower skill levels tend to have a more difficult time passing the CLAST
exams. In comparison to those who pass all four sections, students who
fail were more likely to have been in the bottom of the percentile on
entrance tests, to have listed a language other than English as their first
language, to have higher course withdrawal rates, and to earn lower grade
point averages. Nonetheless, Losak (1984a) points to an imperfect relation 
between academic success and CLAST performance, noting that 20 per- 
cent of the associate degree graduates who took the CLAST in fall 1983 
failed one or more of the CLAST subcomponents. He concludes that 
student grades may not necessarily reflect the competencies requisite to 
successful competition on the CLAST. 

Cohort Testing 

While such tests as the CLAST may satisfy the political need to
certify student competency in basic skills, some scholars point out that
they cannot account for the aggregate of what students learn in college 
courses. For example, Cohen and Brawer (1987) argue that tests required 
of students who move from one grade level to another focus only on the 
most rudimentary skills and drive students toward classes in the basics, 
away from more specialized courses in the arts and sciences. A better 
approach to accountability, Cohen and Brawer argue, is to require criter- 
ion-referenced tests in the liberal arts to be taken periodically by cohorts of 
students as they progress through college. While such tests cannot be used 
to place students in classes or to make decisions about individuals, they 
can be used to measure the value added to student cohorts as a whole from 
year to year. Thus, cohort testing turns the focus of the college assessment 
program from placing students to estimating the efficacy of curriculum 
and instruction as a whole. 

As an example of cohort testing, Cohen and Brawer (1987) describe
the General Academic Assessment (GAA) and its administration to 8,026 
students at four large urban community college districts in 1984. The GAA 
is a test of student knowledge in the liberal arts and includes representative 
items in the humanities, sciences, social sciences, mathematics, and English 
usage. Cohen and Brawer determined that there was a direct relationship 
between GAA scores and the number of units completed by students; for 
example, the more humanities courses a student had taken, the higher the 
student's score on the humanities section of the GAA. If appropriate
controls were introduced, Cohen and Brawer argue, colleges could use such
tests as the GAA in multiple-matrix programs to gain information on stu- 
dent learning and program outcomes that could be sent to state agencies. 
Riley (1984) provides further information on the GAA. 
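
The relationship between units completed and test scores can be tabulated
in a few lines. The Python sketch below groups students into bands of
completed units and compares average scores from band to band. The band
width of 15 units and the function names are assumptions made for
illustration; they are not drawn from the GAA study.

```python
from collections import defaultdict
from statistics import mean

def cohort_means(records, band_width=15):
    """Average test score per band of completed units.

    records: iterable of (units_completed, score) pairs, units as whole numbers.
    band_width: hypothetical band size, not taken from the GAA study.
    """
    bands = defaultdict(list)
    for units, score in records:
        bands[(units // band_width) * band_width].append(score)
    return {start: mean(scores) for start, scores in sorted(bands.items())}

def gains_between_bands(means):
    """Difference in mean score between each band and the band before it."""
    starts = list(means)
    return {b: means[b] - means[a] for a, b in zip(starts, starts[1:])}
```

Differences computed this way describe cohorts, not individuals, and
without the controls that Cohen and Brawer mention they cannot be read as
the value added by instruction alone.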

Conclusion 

This concluding chapter has reviewed the recent literature on student
assessment at the community college. Several themes emerge: descriptive
analyses of testing and assessment programs, the problem of
incorporating student assessment into ongoing student flow and retention
programs, the limited predictive validity of placement tests, the use of
minimum competency testing as an accountability measure, and the
alternative use of cohort testing to document student learning. The
publications cited here by no means constitute the entire body of the
literature on student assessment. Additional writings can be found through
manual or computer searches of ERIC's Resources in Education and Current
Index to Journals in Education.

From the Editors' Notes

The current literature discusses community colleges as a
component of postsecondary education, subject to the
same standards as other institutions. This volume of New
Directions for Community Colleges acknowledges that
we cannot discuss assessment for community colleges as
separate from the dialogue on assessment for four-year
colleges and universities. In fact, community colleges
have a particularly urgent mandate to join in the dialogue,
shape the assessment models, and present their findings and
outcomes to the public. The traditional response to calls to
improve higher education has been to raise entrance
standards, and one survey indicates that some states are
again considering this response. Community colleges are
open-door institutions. If they are to retain their mission,
they have the obligation to present other responses to the
demands for accountability through assessment.